Pixelinx Framework Open Source PHP Framework

6 Dec 09

Configuration, URLs/Slugs/Routing and Solr

Once again I'm going to use this blog to brain-dump some thoughts I've had over the past week or so.

Firstly, configuration. I mentioned before that I need to identify a policy for what configuration gets stored in the database, and what gets stored in the form of text files. I've since realised that I don't need to do anything fancy just yet. The main problems I want to solve are:

  • Create a simple configuration format, even more basic than YAML, but retain the ability to cascade and reference previous nodes (where YAML uses the & and * syntax). This would radically reduce parse time while keeping file sizes minimal.
  • Using the CMS, have the ability to: configure meta data for SEO; choose how to display content (e.g. limit per-page, CSS classes, etc.); and manipulate widget positions.
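To make the first point a bit more concrete, here is a minimal sketch of what such a format could look like. It is written in Python purely for illustration, and everything about the syntax is an assumption: flat `dotted.key = value` lines, a `*prefix` value that copies previously defined nodes (loosely mirroring YAML's alias), and a hypothetical `parse_config` function whose `base` argument handles the cascade.

```python
def parse_config(text, base=None):
    """Parse 'dotted.key = value' lines into a flat dict.

    A value written as '*some.prefix' copies the already-defined node(s)
    under that prefix. Passing `base` cascades: keys in `text` override
    the ones inherited from `base`.
    """
    config = dict(base or {})
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {raw!r}")
        key, value = key.strip(), value.strip()
        if not value.startswith("*"):
            config[key] = value
            continue
        # Reference: copy a single earlier node and/or a subtree of nodes.
        prefix = value[1:]
        subtree = {k: v for k, v in config.items() if k.startswith(prefix + ".")}
        if prefix not in config and not subtree:
            raise KeyError(f"reference to undefined node: {prefix}")
        if prefix in config:
            config[key] = config[prefix]
        for k, v in subtree.items():
            config[key + k[len(prefix):]] = v
    return config


defaults = parse_config("""
# hypothetical example file
db.default.host = localhost
db.default.port = 3306
db.reports = *db.default
db.reports.host = reports.internal
""")

# A second file cascades over the first.
production = parse_config("db.default.host = db.internal", base=defaults)
```

In this sketch references are resolved as plain copies at parse time, so a later override of `db.default.host` does not ripple through to `db.reports`. Whether that is the right behaviour is a design choice the real format would have to make.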

Being able to change too much on the fly in a busy production environment is far from sensible. WordPress (and many others!) go for this approach, but I'm sure this isn't a problem for 99% of their target market (personal blogs and small-to-medium business sites). How I personally prefer to work is to make such changes in a safe and private environment, test thoroughly and then explicitly deploy those changes to production. Therefore, for the first version at least, the framework will not provide access to the application configuration via the CMS.

Secondly, I've decided that my URL generation problem is now solved. In addition, all URLs will be 'routed' via a configuration file, including node-based content. That might not make so much sense right now but hopefully it will once things progress.
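As a rough illustration of routing everything through configuration, the sketch below uses one route table both to match incoming paths and to generate URLs. It is in Python for illustration only; the route names, the `{placeholder}` syntax and the `match`/`url_for` helpers are all assumptions, not the framework's actual design.

```python
import re

# Hypothetical route table, as it might be loaded from a configuration file.
ROUTES = {
    "blog_post": "/blog/{year}/{slug}",
    "node": "/{slug}",
}


def compile_route(pattern):
    """Turn '/blog/{year}/{slug}' into a regex with named groups."""
    # Splitting on a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = "".join(
        f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


COMPILED = {name: compile_route(pattern) for name, pattern in ROUTES.items()}


def match(path):
    """Return (route name, parameters) for the first matching route, or None."""
    for name, regex in COMPILED.items():
        found = regex.match(path)
        if found:
            return name, found.groupdict()
    return None


def url_for(name, **params):
    """Generate a URL from the same table that is used for matching."""
    return ROUTES[name].format(**params)
```

Because generation and matching share one table, changing a URL scheme means editing a single line of configuration, and node-based content is just another route (here the catch-all `node` entry).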

Finally, and probably the biggest issue I'm still faced with: node storage and retrieval. Tackling the EAV model while maintaining high performance is something that nobody has truly achieved yet - at least I've not come across anything impressive. After doing some (a lot of) research, I've found that Magento is supposedly the most successful attempt at one - yet that's a mess.

The use of the COALESCE function in MySQL sounded great: store all attribute values in a single table and let MySQL pull the right value from the correct data-type column. However, how would you perform searches on multiple attributes in a single query (e.g. SELECT ... FROM ... WHERE age = 50 AND name LIKE 'Fred%')? Don't even suggest multiple selects/joins. Unless, of course, you have a few of these rooms at your disposal.
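The problem is easier to see with a tiny example. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (COALESCE behaves the same way for this purpose); the table and column names are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- One row per (node, attribute); only the matching typed column is set.
    CREATE TABLE attribute_value (
        node_id   INTEGER NOT NULL,
        attribute TEXT    NOT NULL,
        int_value INTEGER,
        str_value TEXT
    );
    INSERT INTO attribute_value VALUES
        (1, 'name', NULL, 'Fred Smith'),
        (1, 'age',  50,   NULL),
        (2, 'name', NULL, 'Freda Jones'),
        (2, 'age',  31,   NULL);
""")

# Reading is the easy part: COALESCE returns the first non-NULL typed column.
rows = db.execute("""
    SELECT attribute, COALESCE(int_value, str_value)
    FROM attribute_value
    WHERE node_id = 1
    ORDER BY attribute
""").fetchall()

# Searching is the hard part: each extra attribute in the WHERE clause
# costs another self-join on the same table.
matches = db.execute("""
    SELECT a.node_id
    FROM attribute_value AS a
    JOIN attribute_value AS n ON n.node_id = a.node_id
    WHERE a.attribute = 'age'  AND a.int_value = 50
      AND n.attribute = 'name' AND n.str_value LIKE 'Fred%'
""").fetchall()
```

Two attributes means one self-join; a search form with ten filters means nine, all against what would be the largest table in the database.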

So I did some more hunting around.

I initially thought Solr would be the ideal candidate. However, I'm still not entirely sure how well it can cope with large indexes. Perhaps I can optimise the index by only indexing some content and doing full hydration of nodes using MySQL based on the Solr result set (a Solr/MySQL hybrid). Additionally, I don't really like the idea of having dependencies on non-standard services (Solr is a Java application requiring Tomcat, or similar). That's when I found out that Zend had the same opinion. They went and built their own Lucene port, entirely in PHP. Zend_Search_Lucene isn't as good as Solr - it doesn't provide the number of features I'd like and it's not scalable.
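The hybrid idea can be sketched without touching Solr at all. In the Python below, a plain dictionary stands in for the search index and sqlite3 stands in for MySQL; neither reflects Solr's real API, and the `search` and `hydrate` helpers and the `node` table are hypothetical. The point is only the shape of the flow: the index holds a subset of the content and returns IDs, and the database supplies the full nodes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    INSERT INTO node VALUES
        (1, 'Routing', 'All URLs are routed via a configuration file'),
        (2, 'Search',  'Solr handles the search index'),
        (3, 'Storage', 'Nodes are stored in MySQL');
""")

# Stand-in for the search service: index titles only, mapping term -> node ids.
index = {}
for node_id, title in db.execute("SELECT id, title FROM node").fetchall():
    for term in title.lower().split():
        index.setdefault(term, []).append(node_id)


def search(term):
    """Return matching node ids only, as a deliberately lean index would."""
    return index.get(term.lower(), [])


def hydrate(node_ids):
    """Load the full nodes from the database for the ids the search returned."""
    if not node_ids:
        return []
    placeholders = ",".join("?" * len(node_ids))
    rows = db.execute(
        f"SELECT id, title, body FROM node WHERE id IN ({placeholders})",
        node_ids,
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # Preserve the order (i.e. ranking) given by the search step.
    return [by_id[i] for i in node_ids if i in by_id]


results = hydrate(search("search"))
```

Keeping the index small this way trades one extra primary-key lookup per result page for a lighter index, which is the trade-off that the performance testing would need to confirm.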

Interestingly enough, I've just found a blog post from July 2008 where the author has suggested a similar system to the one I'm proposing here.

So, the next step is to get Solr installed again and start doing some performance testing.
