Desmond Gerber

Introduction to Indexes for MongoDB Atlas Search

Imagine reading a long book like “A Song of Ice and Fire,” “The Lord of the Rings,” or “Harry Potter.” Now imagine that there was a specific detail in one of those books that you needed to revisit. You wouldn’t want to search every page to find it. Instead, you’d want to use the book’s index to quickly locate what you were looking for. This same concept of indexing content within a book carries over to MongoDB Atlas Search with search indexes.

Atlas Search makes it easy to build fast, relevant, full-text search on top of your data in the cloud. It’s fully integrated, fully managed, and available with every MongoDB Atlas cluster running MongoDB version 4.2 or higher.

Correctly defining your indexes is important because they are responsible for ensuring you receive relevant results when using Atlas Search. There is no one-size-fits-all solution, and different indexes will bring you different benefits.

In this tutorial, we’re going to get a gentle introduction to creating indexes that will be valuable for various full-text search use cases.

Before we get too invested in this introduction, it’s important to note that Atlas Search uses Apache Lucene. This means that search indexes are not unique to Atlas Search and if you’re already comfortable with Apache Lucene, your existing knowledge of indexing will transfer. However, the tutorial could act as a solid refresher regardless.

Understanding the Data Model for the Documents in the Example

Before we start creating indexes, we should define the data model we’ll use for the example. To cover various indexing scenarios, the data model will be somewhat complex.

Take the following for example:

{
    "_id": "cea29beb0b6f7b9187666cbed2f070b3",
    "name": "Pikachu",
    "pokedex_entry": {
        "red": "When several of these Pokemon gather, their electricity could build and cause lightning storms.",
        "yellow": "It keeps its tail raised to monitor its surroundings. If you yank its tail, it will try to bite you."
    },
    "moves": [
        {
            "name": "Thunder Shock",
            "description": "A move that may cause paralysis."
        },
        {
            "name": "Thunder Wave",
            "description": "An electrical attack that may paralyze the foe."
        }
    ],
    "location": {
        "type": "Point",
        "coordinates": [-127, 37]
    }
}

The above example document is about Pokemon, but Atlas Search can be used on whatever documents are part of your application.

Example documents like the one above allow us to use text search, geo search, and potentially others. For each of these different search scenarios, the index might change.

When we create an index for Atlas Search, it is created at the collection level.
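Search indexes can be created through the Atlas UI or the Atlas API, or programmatically. As a minimal sketch, assuming your cluster and mongosh version are recent enough to support the createSearchIndex helper, creating a collection-level index could look like this:

// Hypothetical example: create a search index named "default" on the
// pokemon collection. Requires a deployment that supports createSearchIndex.
db.pokemon.createSearchIndex(
    "default",
    { "mappings": { "dynamic": true } }
);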

Statically Mapping Fields in a Document or Dynamically Mapping Fields as the Schema Evolves

There are two ways to map fields within a document when creating an index:

  • Dynamic Mappings
  • Static Mappings

If your document schema is still changing or your use case doesn’t allow for it to be rigidly defined, you might want to choose to dynamically map your document fields. A dynamic mapping automatically indexes new fields as data is inserted.

Take the following for example:

{
    "mappings": {
        "dynamic": true
    }
}

The above JSON represents a valid index. When you add it to a collection, you are essentially mapping every field that exists in the documents and any field that might exist in the future.

We can do a simple search using this index like the following:

db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "thunder",
                "path": ["moves.name"]
            }
        }
    }
]);

We didn’t explicitly define the fields for this index, but attempting to search for “thunder” within the moves array will give us matching results based on our example data.

To be clear, dynamic mappings can be applied at the document level or the field level. At the document level, a dynamic mapping automatically indexes all common data types. At both levels, it automatically indexes all new and existing data.

While convenient, having a dynamic mapping index on all fields of a document comes at a cost. These indexes will take up more disk space and may be less performant.

The alternative is to use a static mapping, in which case you specify the fields to map and what type of fields they are. Take the following for example:

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "name": {
                "type": "string"
            }
        }
    }
}

In the above example, the only field within our document that is being indexed is the name field.

The following search query would return results:

db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "pikachu",
                "path": ["name"]
            }
        }
    }
]);

If we try to search on any other field within our document, we won’t end up with results because those fields are not statically mapped nor is the document schema dynamically mapped.
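For example, with only the name field statically mapped, a query against the unmapped moves fields should come back empty:

// No results: moves.description is not part of the statically mapped index.
db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "paralysis",
                "path": ["moves.description"]
            }
        }
    }
]);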

There is, however, a way to get the best of both worlds if we need it.

Take the following which uses static and dynamic mappings:

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "name": {
                "type": "string"
            },
            "pokedex_entry": {
                "type": "document",
                "dynamic": true
            }
        }
    }
}

In the above example, we are still using a static mapping for the name field. However, we are using a dynamic mapping on the pokedex_entry field. The pokedex_entry field is an object, so any field within that object will get the dynamic mapping treatment. This means all sub-fields are automatically mapped, as well as any new fields that might exist in the future. This could be useful if you want to specify which top-level fields to map while still mapping every field within a particular object.

Take the following search query as an example:

db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "pokemon",
                "path": ["name", "pokedex_entry.red"]
            }
        }
    }
]);

The above search will return results if “pokemon” appears in the name field or the red field within the pokedex_entry object.

When using a static mapping, you need to specify a type for the field or have dynamic set to true on the field. If you only specify a type, dynamic defaults to false. If you only set dynamic to true, Atlas Search automatically indexes the field types it supports dynamically (e.g., string, date, number).

Atlas Search Indexes for Complex Fields within a Document

With the basic dynamic versus static mapping discussion out of the way for MongoDB Atlas Search indexes, now we can focus on more complicated or specific scenarios.

Let’s first take a look at what our fully mapped index would look like for the document in our example:

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "name": {
                "type": "string"
            },
            "moves": {
                "type": "document",
                "fields": {
                    "name": {
                        "type": "string"
                    },
                    "description": {
                        "type": "string"
                    }
                }
            },
            "pokedex_entry": {
                "type": "document",
                "fields": {
                    "red": {
                        "type": "string"
                    },
                    "yellow": {
                        "type": "string"
                    }
                }
            },
            "location": {
                "type": "geo"
            }
        }
    }
}

In the above example, we are using a static mapping for every field within our documents. An interesting thing to note involves the moves array and the pokedex_entry object in the example document. Even though one is an array and the other is an object, both are mapped with the document type. While writing searches isn’t the focus of this tutorial, searching an array of documents and an embedded object look much the same: both use dot notation in the path.
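As a quick sketch, a single query could use dot notation to search both the array of documents and the embedded object at once:

db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "paralyze",
                "path": ["moves.description", "pokedex_entry.yellow"]
            }
        }
    }
]);

And since the location field is mapped with the geo type, geo queries become possible as well. A minimal sketch using the geoWithin operator (the radius is in meters; treat this as illustrative rather than definitive):

db.pokemon.aggregate([
    {
        "$search": {
            "geoWithin": {
                "circle": {
                    "center": {
                        "type": "Point",
                        "coordinates": [-127, 37]
                    },
                    "radius": 100000
                },
                "path": "location"
            }
        }
    }
]);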

Had any of the fields been nested deeper within the document, the same approach would be applied. For example, we could have something like this:

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "pokedex_entry": {
                "type": "document",
                "fields": {
                    "gameboy": {
                        "type": "document",
                        "fields": {
                            "red": {
                                "type": "string"
                            },
                            "yellow": {
                                "type": "string"
                            }
                        }
                    }
                }
            }
        }
    }
}

In the above example, the pokedex_entry field was changed slightly to have another level of objects. Probably not a realistic way to model data for this dataset, but it should get the point across about mapping deeper nested fields.
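A query against this deeper mapping simply extends the dot notation by one more level:

db.pokemon.aggregate([
    {
        "$search": {
            "text": {
                "query": "lightning",
                "path": ["pokedex_entry.gameboy.red"]
            }
        }
    }
]);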

Changing the Options for Specific Mapped Fields

Up until now, each of the indexes has only had its types defined in the mapping. The default options are currently being applied to every field. Options are a way to refine the index further based on your data to ultimately get more relevant search results. Let’s play around with some of the options within the mappings of our index.

Most of the fields in our example use the string data type, and there’s much more we can do with string fields using options. Let’s see what some of those are.

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "name": {
                "type": "string",
                "searchAnalyzer": "lucene.spanish",
                "ignoreAbove": 3000
            }
        }
    }
}

In the above example, we are specifying that we want to use a language analyzer on the name field instead of the default standard analyzer. We’re also saying that the name field should not be indexed if the field value is longer than 3,000 characters.

The 3,000-character cutoff is an arbitrary value for this example, but depending on your use case, adding a limit could improve performance or reduce the index size.

In a future tutorial, we’re going to explore the finer details of what search analyzers are and what they can accomplish.

These are just some of the available options for the string data type. Each data type will have its own set of options. If you want to use the default for any particular option, it does not need to be explicitly added to the mapped field.
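As one more sketch of what options can look like, the following mapping sets an index-time analyzer and uses the multi option to index a second, keyword-analyzed version of the same field. The option names are taken from the Atlas Search string field options; verify them against the documentation for your Atlas version.

{
    "mappings": {
        "dynamic": false,
        "fields": {
            "name": {
                "type": "string",
                "analyzer": "lucene.english",
                "multi": {
                    "keywordAnalyzer": {
                        "type": "string",
                        "analyzer": "lucene.keyword"
                    }
                }
            }
        }
    }
}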

You can learn more about the data types and their indexing options in the official documentation.

Conclusion

You just received what was hopefully a gentle introduction to creating indexes to be used in Atlas Search. To use Atlas Search, you will need at least one index on your collection, even if it is a default dynamic index. However, if you know your schema and are able to create static mappings, it is usually the better way to go to fine-tune relevancy and performance.

To learn more about Atlas Search indexes and the various data types, options, and analyzers available, check out the official documentation.

To learn how to build more on Atlas Search, check out my other tutorials: Building an Autocomplete Form Element with Atlas Search and JavaScript and Visually Showing Atlas Search Highlights with JavaScript and HTML.

Have a question or feedback about this tutorial? Head to the MongoDB Community Forums and let’s chat!

This content first appeared on MongoDB.

Original article source at: https://www.thepolyglotdeveloper.com

Gordon Murray

Synonyms in MongoDB Atlas Search

Sometimes, the word you’re looking for is on the tip of your tongue, but you can’t quite grasp it. For example, when you’re trying to find a really funny tweet you saw last night to show your friends. If you’re sitting there reading this and thinking, “Wow, Anaiya and Nic, you’re so right. I wish there was a fix for this,” strap on in! We have just the solution for those days when your precise linguistic abilities fail you, but you have an idea of what you’re looking for: Synonyms in Atlas Search.

In this tutorial, we are going to be showing you how to index a MongoDB collection to capture searches for words that mean similar things. For the specifics, we’re going to search through content written with Generation Z (Gen-Z) slang. The slang will be mapped to common words with synonyms and as a result, you’ll get a quick Gen-Z lesson without having to ever open TikTok.

If you’re in the mood to learn a few new words, alongside how effortlessly synonym mappings can be integrated into Atlas Search, this is the tutorial for you.

Requirements

There are a few requirements that must be met to be successful with this tutorial:

  • MongoDB Atlas M0 (or higher) cluster running MongoDB version 4.4 (or higher)
  • Node.js
  • A Twitter developer account

We’ll be using Node.js to load our Twitter data, but a Twitter developer account is required for accessing the APIs that contain Tweets.

Load Twitter Data into a MongoDB Collection

Example Tweet Data for Slang Synonyms

Before starting this section of the tutorial, you’re going to need to have your Twitter API Key and API Secret handy. These can both be generated from the Twitter Developer Portal.

The idea is that we want to store a bunch of tweets in MongoDB that contain Gen-Z slang that we can later make sense of using Atlas Search and properly defined synonyms. Each tweet will be stored as a single document within MongoDB and will look something like this:

{
    "_id": 1420091624621629400,
    "created_at": "Tue Jul 27 18:40:01 +0000 2021",
    "id": 1420091624621629400,
    "id_str": "1420091624621629443",
    "full_text": "Don't settle for a cheugy database, choose MongoDB instead 💪",
    "truncated": false,
    "entities": {
        "hashtags": [],
        "symbols": [],
        "user_mentions": [],
        "urls": []
    },
    "metadata": {
        "iso_language_code": "en",
        "result_type": "recent"
    },
    "source": "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Web App</a>",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": 1400935623238643700,
        "id_str": "1400935623238643716",
        "name": "Anaiya Raisinghani",
        "screen_name": "anaiyaraisin",
        "location": "",
        "description": "Developer Advocacy Intern @MongoDB. Opinions are my own!",
        "url": null,
        "entities": {
            "description": {
                "urls": []
            }
        },
        "protected": false,
        "followers_count": 11,
        "friends_count": 29,
        "listed_count": 1,
        "created_at": "Fri Jun 04 22:01:07 +0000 2021",
        "favourites_count": 8,
        "utc_offset": null,
        "time_zone": null,
        "geo_enabled": false,
        "verified": false,
        "statuses_count": 7,
        "lang": null,
        "contributors_enabled": false,
        "is_translator": false,
        "is_translation_enabled": false,
        "profile_background_color": "F5F8FA",
        "profile_background_image_url": null,
        "profile_background_image_url_https": null,
        "profile_background_tile": false,
        "profile_image_url": "http://pbs.twimg.com/profile_images/1400935746593202176/-pgS_IUo_normal.jpg",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/1400935746593202176/-pgS_IUo_normal.jpg",
        "profile_banner_url": "https://pbs.twimg.com/profile_banners/1400935623238643716/1622845231",
        "profile_link_color": "1DA1F2",
        "profile_sidebar_border_color": "C0DEED",
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_text_color": "333333",
        "profile_use_background_image": true,
        "has_extended_profile": true,
        "default_profile": true,
        "default_profile_image": false,
        "following": null,
        "follow_request_sent": null,
        "notifications": null,
        "translator_type": "none",
        "withheld_in_countries": []
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 0,
    "favorite_count": 1,
    "favorited": false,
    "retweeted": false,
    "lang": "en"
}

The above document model is more extravagant than we need. In reality, we’re only going to be paying attention to the full_text field, but it’s still useful to know what exists for any given tweet.

Now that we know what the document model is going to look like, we just need to consume it from Twitter.

We’re going to use two different Twitter APIs with our API Key and API Secret. The first API is the authentication API and it will give us our access token. With the access token we can get tweet data based on a Twitter query.

Since we’re using Node.js, we need to install our dependencies. Within a new directory on your computer, execute the following commands from the command line:

npm init -y
npm install mongodb axios dotenv --save

The above commands will create a new package.json file and install the MongoDB Node.js driver, Axios for making HTTP requests, and dotenv, which the script uses to load environment variables.

Take a look at the following Node.js code which can be added to a main.js file within your project:

const { MongoClient } = require("mongodb");
const axios = require("axios");

require("dotenv").config();

const mongoClient = new MongoClient(process.env.MONGODB_URI);

(async () => {
    try {
        await mongoClient.connect();
        const tokenResponse = await axios({
            "method": "POST",
            "url": "https://api.twitter.com/oauth2/token",
            "headers": {
                "Authorization": "Basic " + Buffer.from(`${process.env.API_KEY}:${process.env.API_SECRET}`).toString("base64"),
                "Content-Type": "application/x-www-form-urlencoded"
            },
            "data": "grant_type=client_credentials"
        });
        const tweetResponse = await axios({
            "method": "GET",
            "url": "https://api.twitter.com/1.1/search/tweets.json",
            "headers": {
                "Authorization": "Bearer " + tokenResponse.data.access_token
            },
            "params": {
                "q": "mongodb -filter:retweets filter:safe (from:codeSTACKr OR from:nraboy OR from:kukicado OR from:judy2k OR from:adriennetacke OR from:anaiyaraisin OR from:lauren_schaefer)",
                "lang": "en",
                "count": 100,
                "tweet_mode": "extended"
            }
        });
        console.log(`Next Results: ${tweetResponse.data.search_metadata.next_results}`)
        const collection = mongoClient.db(process.env.MONGODB_DATABASE).collection(process.env.MONGODB_COLLECTION);
        tweetResponse.data.statuses = tweetResponse.data.statuses.map(status => {
            status._id = status.id;
            return status;
        });
        const result = await collection.insertMany(tweetResponse.data.statuses);
        console.log(result);
    } finally {
        await mongoClient.close();
    }
})();

There’s quite a bit happening in the above code so we’re going to break it down. However, before we break it down, it’s important to note that we’re using environment variables for a lot of the sensitive information like tokens, usernames, and passwords. For security reasons, you really shouldn’t hard-code these values.
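For reference, the environment variables the script expects could live in a .env file like the following, with all values being placeholders for your own credentials:

MONGODB_URI=mongodb+srv://<username>:<password>@<cluster-hostname>
MONGODB_DATABASE=synonyms
MONGODB_COLLECTION=tweets
API_KEY=<your-twitter-api-key>
API_SECRET=<your-twitter-api-secret>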

Inside the asynchronous function, we attempt to establish a connection to MongoDB. If successful, no error is thrown, and we make our first HTTP request.

const tokenResponse = await axios({
    "method": "POST",
    "url": "https://api.twitter.com/oauth2/token",
    "headers": {
        "Authorization": "Basic " + Buffer.from(`${process.env.API_KEY}:${process.env.API_SECRET}`).toString("base64"),
        "Content-Type": "application/x-www-form-urlencoded"
    },
    "data": "grant_type=client_credentials"
});

In this first HTTP request, we are exchanging our API Key and API Secret for an access token to be used in future requests.

Using the access token from the response, we can make our second request to the tweets API endpoint:

const tweetResponse = await axios({
    "method": "GET",
    "url": "https://api.twitter.com/1.1/search/tweets.json",
    "headers": {
        "Authorization": "Bearer " + tokenResponse.data.access_token
    },
    "params": {
        "q": "mongodb -filter:retweets filter:safe",
        "lang": "en",
        "count": 100,
        "tweet_mode": "extended"
    }
});

The tweets API endpoint expects a Twitter specific query and some other optional parameters like the language of the tweets or the expected result count. You can check the query language in the Twitter documentation.

At this point, we have an array of tweets to work with.

The next step is to pick the database and collection we plan to use and insert the array of tweets as documents. We can use a simple insertMany operation like this:

const result = await collection.insertMany(tweetResponse.data.statuses);

The insertMany operation takes an array of objects, which is exactly what we have: an array of tweets, so each tweet will be inserted as a new document within the database.

If you have the MongoDB shell handy, you can validate the data that was inserted by executing the following:

use("synonyms");
db.tweets.find({ });

Now that there’s data to work with, we can start to search it using slang synonyms.

Creating Synonym Mappings in MongoDB

While we’re using a tweets collection for our actual searchable data, the synonym information needs to exist in a separate source collection in the same database.

You have two options for how your synonyms can be mapped: explicit or equivalent. You are not stuck with choosing just one type; you can have a combination of both explicit and equivalent synonym documents in your collection. Choose the explicit format when you need a set of terms to show up as a result of your inputted term, and choose equivalent if you want all terms to show up bidirectionally regardless of your queried term.

For example, the word “basic” means “regular” or “boring.” If we decide on an explicit (one-way) mapping for “basic,” we are telling Atlas Search that if someone searches for “basic,” we want to return all documents that include the words “basic,” “regular,” and “boring.” But! If we query the word “regular,” we would not get any documents that include “basic” because “regular” is not explicitly mapped to “basic.”
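For reference, that one-way mapping for “basic” would be written as an explicit synonym document, where input holds the queryable terms and synonyms holds what they expand to:

{
    "mappingType": "explicit",
    "input": ["basic"],
    "synonyms": ["basic", "regular", "boring"]
}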

If we decide to map “basic” equivalently to “regular” and “boring,” whenever we query any of these words, all the documents containing “basic,” “regular,” and “boring” will show up regardless of the initial queried word.

To learn more about explicit vs. equivalent synonym mappings, check out the official documentation.

For our demo, we decided to make all of our synonyms equivalent and formatted our synonym data like this:

[
    {
        "mappingType": "equivalent",
        "synonyms": ["basic", "regular", "boring"]  
    },
    {
        "mappingType": "equivalent",
        "synonyms": ["bet", "agree", "concur"]
    },
    {
        "mappingType": "equivalent",
        "synonyms": ["yikes", "embarrassing", "bad", "awkward"]
    },
    {
        "mappingType": "equivalent",
        "synonyms": ["fam", "family", "friends"]
    }
]

Each object in the above array will exist as a separate document within MongoDB. Each of these documents contains information for a particular set of synonyms.

To insert your synonym documents into your MongoDB collection, you can use the insertMany() function to put them all into the collection of your choice.

use("synonyms");

db.slang.insertMany([
    {
        "mappingType": "equivalent",
        "synonyms": ["basic", "regular", "boring"]
    },
    {
        "mappingType": "equivalent",
        "synonyms": ["bet", "agree", "concur"]
    }
]);

The use("synonyms"); line is to ensure you’re in the correct database before inserting your documents. We’re using the slang collection to store our synonyms and it doesn’t need to exist in our database prior to running our query.

Create an Atlas Search Index that Leverages Synonyms

Once you have your collection of synonyms handy and uploaded, it’s time to create your search index! A search index is crucial because it’s what allows full-text search queries to run against that collection.

We have included screenshots below of what your MongoDB Atlas Search user interface will look like so you can follow along:

The first step is to click on the “Search” tab, located on your cluster page in between the “Collections” and “Profiler” tabs.

[Screenshot: Find the Atlas Search Tab]

The second step is to click on the “Create Index” button in the upper right hand corner, or if this is your first Index, it will be located in the middle of the page.

[Screenshot: Create a New Atlas Search Index]

Once you reach this page, go ahead and click “Next” and continue on to the page where you will name your Index and set it all up!

[Screenshot: Name the Atlas Search Index]

Click “Next” and you’ll be able to create your very own search index!

[Screenshot: Finalize the Atlas Search Index]

Once you create your search index, you can go back into it and then edit your index definition using the JSON editor to include what you need. The index we wrote for this tutorial is below:

{
    "mappings": {
        "dynamic": true
    },
    "synonyms": [
        {
            "analyzer": "lucene.standard",
            "name": "slang",
            "source": {
                "collection": "slang"
            }
        }
    ]
}

Let’s run through this!

{
    "mappings": {
        "dynamic": true
    },

You have the option of choosing between dynamic and static for your search index, and this can be up to your discretion. To find more information on the difference between dynamic and static mappings, check out the documentation.

"synonyms": [
    {
        "analyzer": "lucene.standard",
        "name": "slang",
        "source": {
            "collection": "slang"
        }
    }
]

This section refers to the synonyms associated with the search index. In this example, we’re giving this synonym mapping a name of “slang,” and we’re using the default index analyzer on the synonym data, which can be found in the slang collection.

Searching with Synonyms with the MongoDB Aggregation Pipeline

Our next step is to put together the search query that will actually filter through your tweet collection and find the tweets you want using synonyms!

The code we used for this part is below:

use("synonyms");

db.tweets.aggregate([
   {
       "$search": {
           "index": "synsearch",
           "text": {
               "query": "throw",
               "path": "full_text",
               "synonyms": "slang"
           }
       }
   }
]);

We want to search through our tweets and find the documents containing synonyms for our query “throw.” This is the synonym document for “throw”:

{
    "mappingType": "equivalent",
    "synonyms": ["yeet", "throw", "agree"]
}

Remember to include the name of your search index from earlier (synsearch). Then, the query we’re specifying is “throw.” This means we want to see tweets that include “yeet,” “throw,” and “agree” once we run this script.

The ‘path’ represents the field we want to search within, and in this case, we are searching for “throw” only within the ‘full_text’ field of the documents and no other field. Last but not least, we want to use synonyms found in the collection we have named “slang.”
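Because the mapping is equivalent rather than explicit, the direction of the query shouldn’t matter. As a quick sanity check, swapping the query term for “yeet” should match the same tweets:

db.tweets.aggregate([
    {
        "$search": {
            "index": "synsearch",
            "text": {
                "query": "yeet",
                "path": "full_text",
                "synonyms": "slang"
            }
        }
    }
]);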

Based on this query, any matches found will include the entire document in the result-set. To better streamline this, we can use a $project aggregation stage to specify the fields we’re interested in. This transforms our query into the following aggregation pipeline:

db.tweets.aggregate([
    {
        "$search": {
            "index": "synsearch",
            "text": {
                "query": "throw",
                "path": "full_text",
                "synonyms": "slang"
            }
        }
    },
    {
        "$project": {
            "_id": 1,
            "full_text": 1,
            "username": "$user.screen_name"
        }
    }
]);

And these are our results!

[
    {
        "_id": 1420084484922347500,
        "full_text": "not to throw shade on SQL databases, but MongoDB SLAPS",
        "username": "codeSTACKr"
    },
    {
        "_id": 1420088203499884500,
        "full_text": "Yeet all your data into a MongoDB collection and watch the magic happen! No cap, we are efficient 💪",
        "username": "nraboy"
    }
]

Just as we wanted, we have tweets that include the word “throw” and the word “yeet!”

Conclusion

We’ve accomplished a ton in this tutorial, and we hope you’ve enjoyed following along. Now, you are set with the knowledge to load in data from external sources, create your list of explicit or equivalent synonyms and insert it into a collection, and write your own index search script. Synonyms can be useful in a multitude of ways, not just isolated to Gen-Z slang. From figuring out regional variations (e.g., soda = pop), to finding typos that cannot be easily caught with autocomplete, incorporating synonyms will help save you time and a thesaurus.

Using synonyms in Atlas Search will improve your app’s search functionality and will allow you to find the data you’re looking for, even when you can’t quite put your finger on it.

If you want to take a look at the code, queries, and indexes used in this blog post, check out the project on GitHub. If you want to learn more about synonyms in Atlas Search, check out the documentation.

If you have questions, please head to our developer community website where the MongoDB engineers and the MongoDB community will help you build your next big idea with MongoDB.

This content first appeared on MongoDB.

Original article source at: https://www.thepolyglotdeveloper.com/
