FrontLook.com Newsletter Tip

Volume 4 - Issue 1 - April 2004

Searching .PDF and other files with the FrontLook Site Search Engine
By David Pfeiffer of FrontLook.com
There are many situations where you would like to search documents that are viewable in a browser but are not .htm pages, for example PDF, Word and PowerPoint files. While the current version of the FrontLook Site Search Engine does not directly search these files, a simple workaround can make it do just that. This tip outlines a technique for searching a PDF file and bringing up the document in the browser if it is selected by the user. This tip assumes that you have already set up the FrontLook Site Search Engine on your web site.

The Basic Idea: the Proxy Search Page
Since the FrontLook Site Search Engine (SSE) can only search web pages, we will convert the .pdf or .doc file to text and place the text on a "proxy" web page. This proxy web page will be added to the appropriate Search Areas. A redirect meta tag will be added to the proxy page as well so that when the proxy page is selected, it will immediately take the browser to the actual document.

In Action
When the user searches for a keyword, the proxy page is searched and referenced in the results list. If the user clicks on the proxy page result link, a proxy page redirect will bring up the actual document in a browser.

Step 1: Prepare the Proxy Page
The technique is illustrated here with a FrontLook Java Effects .pdf file. First, the file is placed on the web site in a known location, and the searchable text in the document must be extracted. This can be done in two ways:

• Bring up the PDF in Acrobat and use the Acrobat text tool to extract the text from the document. Note: If your document is formatted in two columns, the Acrobat text tool tends to jumble up the text.
• Bring up the original document and extract the text or save the document in a plain text format. Note: Many document processors support saving the document as HTML, but sometimes you get the footers and headers mixed in with the text. In this case, turn off the footers before you save the file.

We recommend that you get the text from the original document if possible as it will be the most reliable.

Once you have the text, bring up a blank web page and paste the text. Remove any formatting using the "Format->Remove Formatting..." menu, as formatting just slows down the search. You can also remove any words you don't want the search engine to find, or you can add meta tag keywords related to the subject matter.

Now save the page onto the web, giving it a name that encapsulates the original document name. For example, if the original document is called "JFXProductFlyer.pdf", the proxy page will be called "JFXProductFlyer.pdf.htm".
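If you have many documents, the paste-and-save steps above could be scripted instead of done by hand in FrontPage. The following is a minimal Python sketch of that idea, not part of FrontLook or FrontPage. The function names `proxy_page_name` and `build_proxy_page` are hypothetical, and the sketch assumes the text has already been extracted from the document by one of the two methods above.

```python
import html


def proxy_page_name(document_name: str) -> str:
    """Return the proxy page name: the original document name plus '.htm'."""
    return document_name + ".htm"


def build_proxy_page(title: str, extracted_text: str) -> str:
    """Wrap plain extracted text in a minimal, unformatted HTML page."""
    # Collapse runs of whitespace so no layout debris is carried over.
    plain = " ".join(extracted_text.split())
    # Escape &, < and > so the text cannot be mistaken for markup.
    return (
        "<html>\n<head>\n"
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"<p>{html.escape(plain)}</p>\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Placeholder text; in practice this is the text extracted from the PDF.
    sample_text = "FrontLook Java Effects\n   product   flyer text goes here"
    print(proxy_page_name("JFXProductFlyer.pdf"))
    print(build_proxy_page("JFXProductFlyer.pdf", sample_text))
```

The page produced this way carries no formatting at all, which matches the advice above that formatting only slows down the search. You would still save the result into your web under the proxy page name.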

Step 2: Set Up the Redirect to the Original Document
There are two ways to set up the redirect: 1) using the Refresh meta tag and 2) using an ASP redirect script.

Meta Tag Redirect
To set up the redirect to the original document, right click on the page and select the "Page Properties..." menu item. Then select the "Custom" tab. Press the "Add..." button in the "System Variables (HTTP-EQUIV)" section.


Figure: System Meta Variable dialog

Now enter "Refresh" as the name and "0;URL=JFXProductFlyer.pdf" as the value in the System Meta Variable dialog and press the OK button.

Where:

"0" is the number of seconds to wait before proceeding to the URL.

"URL=JFXProductFlyer.pdf" is the URL of the original document to be loaded in the browser. Replace the "JFXProductFlyer.pdf" with your document URL. This URL can be a relative URL such as "docs/mypdf.pdf" or an absolute URL such as "http://www.mydomain.com/docs/mypdf.pdf".

Here is what the meta tag looks like in the HTML view in case you want to add it to your HTML (must be between the <head> and </head> tags).

<meta http-equiv="Refresh" content="0;URL=JFXProductFlyer.pdf">

Now view the proxy page in the browser to make sure the redirect is working. The original document should come up in the browser after a short delay. The meta tag redirect approach has two drawbacks: 1) to return to the search page, you have to press the back button twice, and 2) the page appears to flash as the redirect takes place. Both of these side effects can be overcome using the Active Server Pages (ASP) redirect statement discussed in the next section.
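If you are generating proxy pages by script rather than through the Page Properties dialog, the meta tag described above can be produced and placed in the page head the same way. This is a minimal Python sketch under that assumption. The helper names `refresh_meta_tag` and `add_redirect` are hypothetical and are not part of FrontLook or FrontPage.

```python
import html
import re


def refresh_meta_tag(document_url: str, delay_seconds: int = 0) -> str:
    """Build the Refresh meta tag that forwards the browser to document_url."""
    content = f"{delay_seconds};URL={document_url}"
    # Escape quotes and ampersands so the attribute value stays well formed.
    return f'<meta http-equiv="Refresh" content="{html.escape(content, quote=True)}">'


def add_redirect(page_html: str, document_url: str, delay_seconds: int = 0) -> str:
    """Insert the Refresh meta tag just before the closing </head> tag."""
    match = re.search(r"</head>", page_html, re.IGNORECASE)
    if match is None:
        raise ValueError("page has no </head> tag to place the meta tag before")
    tag = refresh_meta_tag(document_url, delay_seconds)
    return page_html[: match.start()] + tag + "\n" + page_html[match.start():]


if __name__ == "__main__":
    page = "<html>\n<head>\n<title>JFXProductFlyer.pdf</title>\n</head>\n<body></body>\n</html>\n"
    print(add_redirect(page, "JFXProductFlyer.pdf"))
```

The tag is placed inside the head section because, as noted above, it must sit between the <head> and </head> tags to take effect.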

ASP Redirect
To get rid of the page flash and the back button problem, you can rename the .htm proxy page to an .asp page and add the Redirect statement to the page. Note: your web site must be hosted on a Microsoft server that supports ASP for ASP pages to operate.

The redirect statement looks like this in HTML mode:

<% Response.Redirect "JFXProductFlyer.pdf" %>

Where:

<% - starts the script

Response.Redirect - is the redirect command

"JFXProductFlyer.pdf" - is the URL of the original document to be loaded in the browser. Replace the "JFXProductFlyer.pdf" with your document URL. This URL can be a relative URL such as "docs/mypdf.pdf" or an absolute URL such as "http://www.mydomain.com/docs/mypdf.pdf".

%> - ends the script

This statement must appear before any HTML statements, usually as the very first thing on the page. Note: this statement can only be viewed and edited in HTML mode.
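For scripted proxy pages, the ASP variant can be generated along the same lines. The Python sketch below is only an illustration under stated assumptions: the helper names `asp_redirect_statement`, `to_asp_proxy` and `asp_proxy_name` are hypothetical, and the quote doubling assumes the page's script language is VBScript, the usual default for classic ASP.

```python
def asp_redirect_statement(document_url: str) -> str:
    """Build the ASP redirect statement for the original document."""
    # VBScript string literals escape a double quote by doubling it.
    escaped = document_url.replace('"', '""')
    return f'<% Response.Redirect "{escaped}" %>'


def to_asp_proxy(page_html: str, document_url: str) -> str:
    """Put the redirect statement first, ahead of all HTML on the page."""
    return asp_redirect_statement(document_url) + "\n" + page_html


def asp_proxy_name(htm_name: str) -> str:
    """Turn a proxy name such as 'X.pdf.htm' into 'X.pdf.asp'."""
    if not htm_name.lower().endswith(".htm"):
        raise ValueError("expected a .htm proxy page name")
    return htm_name[: -len(".htm")] + ".asp"


if __name__ == "__main__":
    page = "<html>\n<body>\n<p>extracted text goes here</p>\n</body>\n</html>\n"
    print(asp_proxy_name("JFXProductFlyer.pdf.htm"))
    print(to_asp_proxy(page, "JFXProductFlyer.pdf"))
```

Prepending the statement, rather than inserting it into the head section, follows the rule above that it must come before any HTML on the page.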

Step 3: Add the Proxy Page to the Search Areas
To add your proxy page to one or more search areas, bring your web up in FrontPage and select the "Insert->FrontLook Site Search Engine->Edit Search Areas..." menu item.

The Edit Search Areas window will appear.


Figure: Edit Search Areas window

Add the proxy page or pages to the desired search area or areas.

And That's It!
Now you are ready to test the search.

The same principle can be applied to other file formats such as Word and PowerPoint files.

 


Copyright 2004 DPA Software - All Rights Reserved

Last Modified : 03/06/2008 11:15 PM


FrontLook, Theme Chameleon, Image Chameleon and FrontLook Super Themes are trademarks of DPA Software. Microsoft FrontPage, SharePoint, Microsoft and the Office logo are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Java is a trademark of Sun Microsystems. *PC Magazine is a registered trademark of Ziff Davis Publishing Holdings Inc. Used under license from Ziff Davis Media Inc.