Just some tidbits regarding Plone.
In no way do I pretend to be publishing authoritative information like that on the Plone site: not at all. I just
wanted to put online the little everyday discoveries I make while working on the new Fefsi
site.
Plone can index standard "File" content in PDF form. The problem arises when you want to build your very own
"File-based" content. The following is a summary of my findings; it could be superseded by future releases of
Archetypes (I am using release 1.2.5).
I will mainly focus on the programming side here. The steps necessary to get a working configuration will soon be
added in another item of this list (they are platform-dependent, though, so your mileage may vary).
First of all, you need TextIndexNG, an alternative
indexer written by Andreas Jung. It is highly customizable, multilingual and has relevance ranking; last but not least,
it has converters for PDF, HTML, PowerPoint, Word and PostScript documents.
NB: you will need Xpdf and, more specifically, its "pdftotext" shell command, along with the supporting X libraries, to be able to fully convert a PDF document into text.
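To check that "pdftotext" really is usable on your machine before blaming Plone, a small stand-alone script helps. The following is only a minimal sketch in present-day Python, outside Zope; the function names are my own invention for illustration, and it assumes that "pdftotext" is on the PATH and accepts "-" as the output file (meaning standard output) and the "-enc" option.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, encoding="UTF-8"):
    """Return the argument list that converts pdf_path to text on stdout.

    Hypothetical helper, not part of Plone or TextIndexNG.
    """
    # "-" as the output file tells pdftotext to write to standard output.
    return ["pdftotext", "-enc", encoding, pdf_path, "-"]


def pdf_to_text(pdf_path):
    """Return the extracted text, or None when pdftotext is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        capture_output=True,
        check=True,  # raise CalledProcessError if the conversion fails
    )
    return result.stdout.decode("utf-8", errors="replace")
```

If pdf_to_text() returns None, the converter is missing and no amount of catalog configuration will make the PDF content searchable.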
Supposing everything is in place and working, let's go straight to the code. Here is a very simple archetype that has just a description field and a file field where we will store the PDF document:
The code is commented, and most of it should be easy enough to understand. As I mentioned in the listings, I "borrowed"
code from the Collective (from Andreas Jung's ATTypes).
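The general idea behind such a content type can be shown without Zope at all. What follows is not the archetype itself, just a framework-free sketch under my own assumptions: the class name and the "converter" argument are hypothetical, and the only name taken from the real setup is SearchableText, the attribute that the catalog index reads.

```python
class PdfDocumentSketch:
    """Hypothetical stand-in for a file-based content type.

    It has a description and a PDF payload, like the archetype described
    above, but it is a plain Python class, not an Archetypes schema.
    """

    def __init__(self, description, pdf_bytes, converter):
        self.description = description
        self.pdf_bytes = pdf_bytes
        # converter: any callable that takes PDF bytes and returns text.
        self._converter = converter

    def SearchableText(self):
        """Return the text that should be indexed for this object."""
        parts = [self.description]
        try:
            parts.append(self._converter(self.pdf_bytes))
        except Exception:
            # A failed conversion should not prevent the description
            # from being indexed, so the error is swallowed here.
            pass
        return " ".join(part for part in parts if part)
```

The point of the design is that the index never sees the binary file: it only sees whatever text SearchableText returns, so the conversion has to happen inside (or before) that method.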
Before registering this archetype we need to perform the following operations on the Plone catalog through the ZMI:
- Open the Portal Transform Tool in the ZMI of your Plone site and add a transform with the following parameters:
  ID: pdf_to_text
  Module: Products.PortalTransforms.transforms.pdf_to_text
- Add a TextIndexNG index. The index must have the following parameters:
  Name: SearchableText
  Indexed Attributes: SearchableText
  Use Converters: enabled
  The index can include other attributes (e.g. PrincipiaSearchSource or your own custom definitions); what is important is that
  'SearchableText' MUST stay first.
- Regenerate the indexes. Done.
This happens to be a characteristic of Plone 2.0RC5 and RC6, but it seems to be an issue related to early Zope 2.7 releases.
You always have the option of upgrading to a mainstream version, but in case you can't, here is how to make the
missing button reappear:
In the file-system of your machine open the following file:
uncomment the line:
comment the line:
Restart Plone manually (from the shell or the Windows menu).
In short: you make Zope start as a daemon through the zopectl script. Only this way will the button work.
I have to clean up some messy code before I publish this. Let's say it will come soon.