Whether you use the Sphinx API or SphinxQL, if your Sphinx server is configured properly, you can search and filter by a myriad of indices. But one big obstacle is Sphinx doesn’t offer an OR comparison operator.
THE CHALLENGE
I want a search result set that shows all PUBLIC records and excludes PRIVATE records except for those users that have permission to see certain PRIVATE records (like owners of the record or a peer-to-peer share scenario).
the sphinx.conf might look like:
source srcRecords : srcDB { sql_query_range = SELECT MIN(id),MAX(id) \ FROM records_sphinx_index /*sphinx*/ sql_query = SELECT id, id as record_id, name, \ user_id, status FROM records_sphinx_index \ WHERE id>=$start AND id<=$end /*sphinx*/ sql_attr_uint = record_id sql_attr_uint = user_id sql_attr_string = name sql_attr_uint = status }
A SQL query to retrieve all public records and only private records that belong to users 1,2 and 3 would ideally look like:
SELECT record_id, name FROM table WHERE (user_id in (1,2,3) AND status='PRIVATE') OR (status = 'PUBLIC');
But alas, Sphinx does NOT support OR, and the above query is an invalid SphinxQL statement. Therefore, I have to take a different approach.
THE SOLUTION
If I create a multi-valued attribute named visible_to that contains a list of user_ids for those who have permission to see the record, then I could query on one field instead of two eliminating the need for an OR.
All PUBLIC records would have visible_to value = “0” and PRIVATE records would have a list of user_ids of those who can access (excluding the value “0” because they are not PUBLIC records). The owner and any users the owner has granted access would be the multi-values. visible_to value = “1,2,3”
Here’s what the updated excerpt from the sphinx.conf might look like:
source srcRecords : srcDB { sql_query_pre = SET SESSION query_cache_type=OFF sql_query_range = SELECT MIN(id),MAX(id) \ FROM records_sphinx_index /*sphinx*/ sql_query = SELECT id, id as record_id, name, \ visible_to FROM records_sphinx_index \ WHERE id>=$start AND id<=$end /*sphinx*/ sql_attr_uint = record_id sql_attr_string = name sql_attr_multi = uint visible_to from field }
Now we can querying Sphinx for public records and private records that user_id 3 is allowed to see with a valid SphinxQL statement that looks like:
SELECT record_id, name FROM table WHERE visible_to IN (0,3);
IMPORTANT: You’ll see that I used SOURCE-TYPE of ‘field’ in my sql_attr_multi definition. I did this because there was not a subquery that could accomplish populating both a 0 for PUBLIC records and getting rows for users that have permission to see.
Sphinx documentation doesn’t give examples nor explain how to use SOURCE-TYPE ‘field’. After experimenting, I discovered a simple comma delimited list of ids populated in the the visible_to column of my record_sphinx_index table (the one being indexed) does the trick!
THE TAKE-AWAY
I’m not trying to explain how I populated the visible_to field, rather, I’m sharing my discovery that you can use sql_attr_multi with a SOURCE-TYPE = ‘field’ as long as the referenced field contains a comma delimited list of values.