[1]:
import mdf_toolbox

Globus Search Utilities

The MDF Toolbox provides a few utilities to make integrating with Globus Search easier.

format_gmeta

format_gmeta() takes a dictionary of data and returns it wrapped as a GMetaEntry, the Globus Search GMeta format. When you pass a dictionary, you must also provide acl (the Access Control List, or ["public"] for public data) and identifier (a unique ID for this entry, or an existing ID to overwrite that entry).

To make a GIngest (the final form of a Globus Search ingest), pass a list of GMetaEntry objects to format_gmeta() instead of a dictionary.

[2]:
my_data = {
    "foo": "bar",
    "baz": [1, 2, 3, 4]
}
gmeta_entry = mdf_toolbox.format_gmeta(my_data,
                                       acl=["public"],
                                       identifier="abc123")
[3]:
gmeta_entry
[3]:
{'@datatype': 'GMetaEntry',
 '@version': '2016-11-09',
 'content': {'baz': [1, 2, 3, 4], 'foo': 'bar'},
 'subject': 'abc123',
 'visible_to': ['public']}
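
The output above shows how the arguments map onto the entry: identifier becomes subject, acl becomes visible_to, and the data dictionary is stored under content. As a minimal sketch of that mapping, the hypothetical helper sketch_gmeta_entry below (not part of mdf_toolbox) builds the same dictionary by hand. It assumes nothing beyond the layout printed above; the real function may perform additional validation.

```python
def sketch_gmeta_entry(data, acl, identifier):
    """Hypothetical stand-in that mirrors the GMetaEntry layout shown above."""
    # Both arguments are required when the input is a dictionary.
    if not acl or not identifier:
        raise ValueError("acl and identifier are both required")
    return {
        "@datatype": "GMetaEntry",
        "@version": "2016-11-09",
        "subject": identifier,     # unique ID for this entry
        "visible_to": list(acl),   # access control list
        "content": data,           # the original data, unchanged
    }


sketch_entry = sketch_gmeta_entry({"foo": "bar", "baz": [1, 2, 3, 4]},
                                  acl=["public"],
                                  identifier="abc123")
print(sketch_entry["subject"], sketch_entry["visible_to"])
```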
[4]:
list_of_gmeta_entry = [gmeta_entry]
g_ingest = mdf_toolbox.format_gmeta(list_of_gmeta_entry)
[5]:
g_ingest
[5]:
{'@datatype': 'GIngest',
 '@version': '2016-11-09',
 'ingest_data': {'@datatype': 'GMetaList',
  '@version': '2016-11-09',
  'gmeta': [{'@datatype': 'GMetaEntry',
    '@version': '2016-11-09',
    'content': {'baz': [1, 2, 3, 4], 'foo': 'bar'},
    'subject': 'abc123',
    'visible_to': ['public']}]},
 'ingest_type': 'GMetaList'}
[6]:
# To submit the ingest, call ingest() on a globus_sdk.SearchClient instance:
# search_client.ingest(index, g_ingest)
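
A GIngest usually carries more than one entry. The sketch below wraps two hand-written GMetaEntry dictionaries in the GIngest layout printed above. The helper sketch_g_ingest and the second subject "def456" are hypothetical and exist only for illustration; in practice format_gmeta() does this wrapping for you.

```python
def sketch_g_ingest(entries):
    """Hypothetical stand-in: wrap GMetaEntry dicts in the GIngest layout."""
    for entry in entries:
        if entry.get("@datatype") != "GMetaEntry":
            raise ValueError("every item must be a GMetaEntry")
    return {
        "@datatype": "GIngest",
        "@version": "2016-11-09",
        "ingest_type": "GMetaList",
        "ingest_data": {
            "@datatype": "GMetaList",
            "@version": "2016-11-09",
            "gmeta": list(entries),
        },
    }


# Two hand-written entries in the GMetaEntry layout shown earlier.
sketch_entries = [
    {"@datatype": "GMetaEntry", "@version": "2016-11-09",
     "subject": "abc123", "visible_to": ["public"],
     "content": {"foo": "bar"}},
    {"@datatype": "GMetaEntry", "@version": "2016-11-09",
     "subject": "def456", "visible_to": ["public"],
     "content": {"foo": "qux"}},
]
sketch_ingest = sketch_g_ingest(sketch_entries)
print(len(sketch_ingest["ingest_data"]["gmeta"]))
```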

gmeta_pop

gmeta_pop() takes the results from a Globus Search query and unwraps them from the GMeta format. You can pass in a GlobusHTTPResponse from the SearchClient, a JSON-dumped string, or a dictionary.

[7]:
sample_search_result = {
    '@datatype': 'GSearchResult',
    '@version': '2016-11-09',
    'count': 11,
    'gmeta': [{
        '@datatype': 'GMetaResult',
        '@version': '2016-11-09',
        'content': [{
            "foo": "bar",
            "baz": [1, 2, 3, 4, 5]
        }, {
            "food": "bard",
            "bazd": ["d"]
        }],
        'subject': "http://example.com/abc123",
    }],
    'offset': 0,
    'total': 22
}
[8]:
mdf_toolbox.gmeta_pop(sample_search_result)
[8]:
[{'baz': [1, 2, 3, 4, 5], 'foo': 'bar'}, {'bazd': ['d'], 'food': 'bard'}]

To also get the metadata associated with your query (the total number of query matches), pass info=True; gmeta_pop() then returns a tuple of (results, metadata).

[9]:
mdf_toolbox.gmeta_pop(sample_search_result, info=True)
[9]:
([{'baz': [1, 2, 3, 4, 5], 'foo': 'bar'}, {'bazd': ['d'], 'food': 'bard'}],
 {'total_query_matches': 22})
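
To make the unwrapping concrete, here is a minimal sketch of the behavior shown above, written without mdf_toolbox. The function sketch_gmeta_pop is a hypothetical illustration, not the library's implementation: it handles only a dictionary or a JSON-dumped string (not a GlobusHTTPResponse), and it assumes each GMetaResult stores its content as a list, as in the sample result.

```python
import json


def sketch_gmeta_pop(gmeta, info=False):
    """Hypothetical stand-in: flatten the content of every GMetaResult."""
    # Accept a JSON-dumped string as well as a dictionary.
    if isinstance(gmeta, str):
        gmeta = json.loads(gmeta)
    results = []
    for gmeta_result in gmeta.get("gmeta", []):
        # Each GMetaResult holds a list of content dictionaries.
        results.extend(gmeta_result.get("content", []))
    if info:
        return results, {"total_query_matches": gmeta.get("total")}
    return results


small_result = {
    "@datatype": "GSearchResult",
    "gmeta": [{
        "@datatype": "GMetaResult",
        "content": [{"foo": "bar"}, {"food": "bard"}],
        "subject": "http://example.com/abc123",
    }],
    "total": 22,
}

unwrapped = sketch_gmeta_pop(small_result)
unwrapped_from_json, query_info = sketch_gmeta_pop(json.dumps(small_result),
                                                   info=True)
```

Both calls yield the same list of content dictionaries; the second also returns the match count taken from the total field.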

translate_index

Globus Search expects queries to identify an index by its UUID rather than by its name. translate_index() takes an index name and returns the corresponding UUID; if no match is found, it returns the input unchanged.

[10]:
mdf_toolbox.translate_index("mdf")
[10]:
'1a57bbe5-5272-477f-9d31-343b8258b7a5'
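
The return-the-input fallback means you can pass either a name or a UUID through the same call. The sketch below illustrates that contract with a hard-coded lookup table. Both KNOWN_INDEXES and sketch_translate_index are hypothetical; how mdf_toolbox actually resolves names is not shown in this notebook, and the only mapping used here is the one from the output above.

```python
# Hypothetical lookup table; the single entry comes from the output above.
KNOWN_INDEXES = {
    "mdf": "1a57bbe5-5272-477f-9d31-343b8258b7a5",
}


def sketch_translate_index(index_name):
    """Hypothetical stand-in: return the UUID if known, else the input."""
    return KNOWN_INDEXES.get(index_name, index_name)


known = sketch_translate_index("mdf")
unknown = sketch_translate_index("not-a-real-index")
already_uuid = sketch_translate_index(known)
```

Because unknown values pass through unchanged, translating a value that is already a UUID is harmless.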