Subject Design Question: Storing and using generic text "tags"
Author Jarrod Hollingworth
I'll be adding a tags system to an existing application so that users can
enter one or more tags to help search and categorize various items, similar
to what photo management software or blogs do.

The user will be able to: a) type a comma separated tag list when
adding/updating items, b) perform a search based on the entry of one or more
tags (and/or), and c) generate a report detailing or summarizing items "by
tag".

Ideally the user should not be limited to the number of tags per item but a
reasonable limit would be ok. Ideally the grunt work should be possible
within the DATABASE and not rely on complex code in the client app. Ideally
it should be easy for someone to add new items with tags directly using a
database tool rather than the application.

e.g.
Item1: tags=family,holiday
Item2: tags=holiday,skiing
Item3: tags=skiing,simon,funny

Search "family or simon" would return Item1 and Item2.
Report by tag might list:
family
Item1
funny
Item3
holiday
Item1
Item2
simon
Item3
skiing
Item2
Item3

I figure that there are three main ways to implement this:

1) Have a single TAGS column on the ITEM table containing comma separated
tags.
Store as-typed tags in this column.
Use LIKE for searching (complex for and/or situations)
For the report fetch distinct TAGS, pull apart the individual tags and
remove dups, formulate a complex query to pull back the items in the
appropriate order, OR pull in all ITEM rows and post-process on the client.
This might be ok if the user was forced to enter at least one and up to a
small number of tags to limit the extent of the report.
Perhaps the LIST function added in FB 2.1 would help pull the tags apart.

2) Have a fixed number of TAGn columns (TAG1, TAG2, ...) on the ITEM table,
each storing an individual tag.
Restrict number of tags that can be entered. Pull apart tags on entry and
populate in separate columns.
Use exact matching on each TAGn column for searching (again complex for
and/or situations)
For the report fetch distinct list of values in the TAGn columns,
formulate a complex query to pull back the items in the appropriate order.

3) Have a separate TAG table with a many-to-many mapping table to reference
ITEMs.
Pull apart tags on entry and store new tags in separate TAG rows, create
mapping row for each tag.
Use exact matching on TAG rows for searching, join to ITEM table.
For the report simply join ITEM to TAG through mapping, order by TAG.

Design 1) is a less complex table design and would make it easy to add items
using a database tool (simple INSERT) but the searching and reporting would
be complex and likely slow. Design 3) makes it easy to search/report but
more complex to add items. Perhaps a stored proc could be used to add items,
passing all ITEM columns and a comma-separated tags string and the stored
proc could pull apart the tags and create the TAG rows and mapping rows
after inserting the ITEM. I don't like design 2) but it may be a reasonable
compromise.

Is it easy to process a comma-separated string passed as a parameter to a
stored proc? E.g. Split into separate values, strip leading/trailing
whitespace. Any simple non-udf solution in FB1.5.4 or is FB2.1 or other
required?

Has anyone implemented such a tag system before? How did you do it?

Any thoughts on the above designs or a better design?

Thanks for your input.

Regards,

Jarrod Hollingworth
Complete Time Tracking
http://www.complete-time-tracking.com/