Discussion:
Jackrabbit - document storage (file system / MySQL)
khelifa senoussi Elhadj
13 years ago
Hi,

I'm trying to decide which type of storage to use for a project in which I
upload and download several files/documents (pdf, doc, ...) using
Apache Jackrabbit.
Could you please tell me which kind of store I should use to avoid
performance problems:
1- File system
2- MySQL


Cordially,
Elhadj
Francisco Carriedo Scher
13 years ago
Hi there,

I think it depends on the size of the files you are going to store (I have no
idea how either option behaves with a large number of files). If your files
are under 1 MB, use MySQL, as it performs well within that range.

Use a filesystem otherwise.

This is just my perception, any other opinions?

Regards.
Mark Herman
13 years ago
Agreed, it is generally accepted that the filesystem is going to be faster.
Be aware of premature optimization, though. The overhead of a DB may come with
features that you'd want, and you could end up giving them up for a difference
that the users will never notice.

Also, make sure you avoid a high number of children under one node if
performance is important. The usual recommendation is fewer than 10k children
per node, but I'd structure it so you're never even close to that limit.
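
One way to stay well under that limit is to spread files over intermediate folder nodes derived from a hash of the file name. The following is only a minimal sketch of that idea, not something Jackrabbit does for you: the class name BucketedPath, the method pathFor, and the choice of two levels of 256 buckets each are all assumptions made for illustration. It also assumes the file name is already a valid JCR node name (no escaping is done here).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Jackrabbit): maps a file name to a
// relative path with two hash-derived folder levels, e.g. "xx/yy/report.pdf".
public final class BucketedPath {

    private BucketedPath() {
    }

    // Two hex characters per level give 256 buckets per level, i.e. up to
    // 65,536 leaf folders, so each folder holds only a small share of the files.
    public static String pathFor(String fileName) {
        byte[] digest = sha1(fileName);
        String level1 = String.format("%02x", digest[0] & 0xff);
        String level2 = String.format("%02x", digest[1] & 0xff);
        return level1 + "/" + level2 + "/" + fileName;
    }

    private static byte[] sha1(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return md.digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints a path of the form "<2 hex chars>/<2 hex chars>/report.pdf".
        System.out.println(pathFor("report.pdf"));
    }
}
```

The resulting relative path could then be used below whatever parent node holds your documents, creating the two folder nodes on demand. The same name always maps to the same bucket, so lookups do not need an extra index.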

Alexander Klimetschek
13 years ago
Is the original question about the DataStore (for binaries/files) or generally about the persistence manager?

For (larger) binaries the file system based datastore will be a lot more efficient than databases, as relational DBs are usually not made for binaries and the overhead they introduce does not give any benefit for binaries, since you can't query for them (Jackrabbit has a separate full text search index).

For the persistence manager, which also stores all the fine-grained data such as small properties, this is very different. The only simple file-system-based PM was the XMLPersistenceManager, which is very, very inefficient. The Bundle DB PMs are the most efficient ones in Jackrabbit (but note that they make little use of the DB's query capabilities: they only ask for node bundles using the node's UUID as the primary key, that's all).

Cheers,
Alex
--
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel