sub_filter evaluates query_string in match_column context.
sub_filter can be used in only --filter in select.
sub_filter requires two arguments - match_column and query_string.
sub_filter(match_column, query_string)
Here are a schema definition and sample data to show usage.
Sample schema:
Execution example:
table_create Comment TABLE_PAT_KEY UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment name COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment content COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Blog TABLE_PAT_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog title COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog content COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog comments COLUMN_VECTOR Comment
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment blog_comment_index COLUMN_INDEX Blog comments
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Lexicon TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon comment_content COLUMN_INDEX|WITH_POSITION Comment content
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon comment_name COLUMN_INDEX|WITH_POSITION Comment name
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon blog_content COLUMN_INDEX|WITH_POSITION Blog content
# [[0, 1337566253.89858, 0.000355720520019531], true]
Sample data:
Execution example:
load --table Comment
[
{"_key": 1, "name": "A", "content": "groonga"},
{"_key": 2, "name": "B", "content": "groonga"},
{"_key": 3, "name": "C", "content": "rroonga"},
{"_key": 4, "name": "A", "content": "mroonga"},
]
# [[0, 1337566253.89858, 0.000355720520019531], 4]
load --table Blog
[
{"_key": "groonga's blog", "content": "content of groonga's blog", comments: [1, 2, 3]},
{"_key": "mroonga's blog", "content": "content of mroonga's blog", comments: [2, 3, 4]},
{"_key": "rroonga's blog", "content": "content of rroonga's blog", comments: [3]},
]
# [[0, 1337566253.89858, 0.000355720520019531], 3]
Execution example:
select Blog --output_columns _key --filter "comments.name @ \"A\" && comments.content @ \"groonga\""
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 2
# ],
# [
# [
# "_key",
# "ShortText"
# ]
# ],
# [
# "groonga's blog"
# ],
# [
# "mroonga's blog"
# ]
# ]
# ]
# ]
When executing above query, not only "groonga's blog", but also "mroonga's blog". This is not what you want because user "A" does not mention "groonga" to "mroonga's blog".
Without sub_filter, it means that following conditions are met.
Execution example:
select Blog --output_columns _key --filter 'sub_filter(comments, "name @ \\"A\\" && content @ \\"groonga\\"")'
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 1
# ],
# [
# [
# "_key",
# "ShortText"
# ]
# ],
# [
# "groonga's blog"
# ]
# ]
# ]
# ]
On the other hand, executing above query returns the intended result. Because the arguments of sub_filter is evaluated in comments column's context.
It means that sub_filter requires following condition is met.
There are two required parameter, match_column and query_string.
It specifies match_column equivalent parameter.
See match_columns about match_column.
sub_filter returns a value of boolean (true or false).