Mongodb Products schema with proper indexes

indexmongodb

I'm totally new to mongodb with a MYSQL background.

I'm working on an e-commerce website with some millions of products
So we decided to migrate from MYSQL to mongodb.

So far this is the products structure that i came up with as a final schema

{
    id: 12, // filterable + sortable 
    sku: 'some-sku', // filterable
    status: 'enabled', // filterable by default to get enabled products only 
    createdAt: {
        date: '22-11-2018',
        timestamp: 41231312
    },
    locales: {
        en: {
            title: 'Product Title', // searchable + sortable
            description: 'Some Rich description', 
            shortDescription: 'Some short description',// searchable
            category: 'Category name',
            brand: 'Brand name',
            tags: ['tag1', 'tag2', 'tag3'],// searchable
            attributes: {
                // filterable
                color: 'red', 
                size: 'XL' 
            },
            specifications: {
                // list of specifications
            }
        }
    },
    images: {
        main: {
            large: 'path-to-image',
            medium: 'path-to-image',
            small: 'path-to-image' 
        },
        all: [
            {
                large: 'path-to-image',
                medium: 'path-to-image',
                small: 'path-to-image'        
            }
        ]
    },
    categoryId: 562, // filterable
    brand: {
        id: 13, // filterable
        logo: 'path-to-log'
    },
    price: {
        original: 200,
        discount: {
            value: 40,
            percentage: 20 // filterable + sortable
        },
        salePrice: 160 // filterable with a min-max price range + sortable
    },
    rating: 4.5, // filterable + sortable
    reviews: [
        {
            customer: {
                id: 644121,
                name: 'Customer Name',
                review: 'Customer review',
                rating: 4, 
                createdAt: {
                    date: '22-11-2018'
                }
            }
        }
    ],
    children: [
        // list of children products same schema
    ],
    similarProducts: [
        // list of similar products ids
    ]
}

Quick notes about the schema:

There are many fields will be used in search/filter like:

  • By default, Only enabled products will be displayed.

  • title, shortDescription and tags as full text search.

  • Also customer can filter products by sku category id, brand id, salePrice, rating and attributes.

Customer can sort products display using any of the following in asc/desc order:

  • Latest/Oldest: using the creation date

  • Price: High to low or Low to high

  • Rating: High to low or Low to high

  • Discount Percentage: High to low or Low to high

So Here are my questions:

1- Based on the earlier info, customer may combine many filters in same time combined with different sort based on his selection, so should i've to create indexes for all use cases?

2- Products could be in more than on category and also any product in child category will be implicitly added to its parent categories recursively

For example if we've the following category tree:

Sports >> Clothes >> Men >> Shoes

if a product in the Shoes category, it should be also in all previous categories.

In MYSQL i had a product_categories table that holds product_id and category_id so i can retrieve directly all products for the current category without looping through all children to get its ids.

Now should i add another field called categoriesList that will contain all the ids of the categories that the product is included in?

Sorry for the long talk and thanks in advance.

Best Answer

Based on your requirements, following is the solutions:

  1. You must find the frequent query areas. At initial lookup it looks like adding an index for each of the use case is good idea. But you may need to revise it over the period of time and observation of index utilization. Following statement from the official document describes the most important point about the cost of indexing:

    Before you build indexes, map out the types of queries you will run so that you can build indexes that reference those fields. Indexes come with a performance cost, but are more than worth the cost for frequent queries on large data set. Consider the relative frequency of each query in the application and whether the query justifies an index.

    You can find more details here.

  2. You are correct on this point.