Elasticsearch Queries Studio 2

In this studio you will practice working with indices (deleting, re-indexing) and mappings. You will also run some filtered and geo queries on Elasticsearch.

Loading Data

We will be working with the same data set as yesterday for /tweets. However, we are adding location data for our tweets.

Use the tweets-geo.sh script to create a new index: /tweets_geo.

Look over the script as it may help you with your tasks.

The script isn’t complete: you will need to work out the request that creates the appropriate mapping. This is the second task in the list below. Looking over the baseball-teams-stadium.sh script will help, since it shows a similar request.
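As a rough sketch of what such a mapping request body can look like, the Python snippet below builds the JSON and prints it. It assumes the typeless mapping layout used by Elasticsearch 7 and later; older versions nest "properties" under a document type name. The index name and the helper geo_point_mapping are illustrative only, so check both against your script. Because the type of an existing field cannot be changed in place, the mapping has to be created before any documents are indexed.

```python
import json

# Assumed index name, taken from the task list; adjust it to match your script.
INDEX = "twitter_geo"


def geo_point_mapping(field="location"):
    """Build an index-creation body that maps `field` as a geo_point.

    Hypothetical helper: assumes the typeless (7.x+) mapping layout.
    """
    return {"mappings": {"properties": {field: {"type": "geo_point"}}}}


body = geo_point_mapping()

# Print the request so it can be pasted into a curl command or a .sh script.
print(f"PUT /{INDEX}")
print(json.dumps(body, indent=2))
```

The printed JSON is the body of a PUT request to the index URL, sent before the bulk load in the script runs.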

Your Tasks

Carry out each of the following tasks. Once you have a successful query for each, save the command in a .txt file for submission.

  1. What is the data type of the location field in the twitter_geo index? What should it be?
  2. Fix the issue with location by editing tweets_geo.sh to explicitly map location as a geo_point field. Hint: To build the JSON that you’ll need to create the new mapping for twitter_geo, you can copy the response from fetching the mapping for twitter and modify it.
  3. Find all tweets with between 5 and 10 likes, inclusive of those endpoints.
  4. Find all tweets by Mary Jones that have at least 2 likes.
  5. Find all tweets that contain the text “Elasticsearch”, including misspellings up to an edit distance of 2.
  6. Find all tweets that have a location. (Hint: Try the exists query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html)
  7. Find all tweets that do not have a location.
  8. Find the average number of likes for tweets that have a location.
  9. Find the average number of likes for tweets that do not have a location.
  10. Find all tweets with locations within 500km of Boise, ID.
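The general shapes of the query bodies these tasks rely on can be sketched as follows. This is a starting point under stated assumptions, not a set of finished answers: the field names likes, content, and location are guesses at the data set's schema, the helper functions are hypothetical, and the Boise coordinates are approximate. Check the actual field names against your mapping before using any of it.

```python
import json

# Approximate coordinates of Boise, ID (assumption; look up exact values if needed).
BOISE = {"lat": 43.6150, "lon": -116.2023}


def likes_between(low, high, field="likes"):
    """Range query, inclusive at both ends (gte / lte)."""
    return {"query": {"range": {field: {"gte": low, "lte": high}}}}


def fuzzy_text(term, field="content", fuzziness=2):
    """Match query that tolerates up to `fuzziness` edits in the term."""
    return {"query": {"match": {field: {"query": term, "fuzziness": fuzziness}}}}


def has_field(field="location"):
    """Exists query: documents where `field` has a value."""
    return {"query": {"exists": {"field": field}}}


def missing_field(field="location"):
    """Negated exists query: documents where `field` has no value."""
    return {"query": {"bool": {"must_not": {"exists": {"field": field}}}}}


def average_of(query_body, field="likes"):
    """Add an avg aggregation to a query body; size 0 suppresses the hits."""
    body = dict(query_body)
    body["size"] = 0
    body["aggs"] = {"avg_" + field: {"avg": {"field": field}}}
    return body


def within_distance(point, distance="500km", field="location"):
    """geo_distance filter; requires `field` to be mapped as geo_point."""
    return {
        "query": {
            "bool": {"filter": {"geo_distance": {"distance": distance, field: point}}}
        }
    }


# Example: print one body so it can be sent as the JSON payload of a _search request.
print(json.dumps(average_of(has_field()), indent=2))
```

Each function returns a plain dict, so json.dumps turns it into the body of a GET or POST to the index's _search endpoint. Combining conditions, such as an author match together with a minimum number of likes, follows the same pattern with a bool query.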