How does MongoDB Treat Mixed Types

database-designdatatypesmongodb

I'm messing around with mongodb and noticed I can create mixed data types.

db.getSiblingDB("bermuda").getCollection("test").insertOne({
    "profile": {
        "t1": 100
    }
});

db.getSiblingDB("bermuda").getCollection("test").insertOne({
    "profile": {
        "t1": "test"
    }
});

Which results in:
x

But because "t1": 100 was written first, DataGrip thinks that this is an int column. I can query these results both ways.

SELECT * FROM bermuda.test
WHERE profile.t1 = 'test'
;

SELECT * FROM bermuda.test
WHERE profile.t1 = 100
;

SELECT * FROM bermuda.test
WHERE profile.t1 > 0
;

This raises a lot of questions.

  1. Does Mongodb consider this this to be an Int32, String, both, or some kind of "Any" type?
  2. How does this affect indexing?
  3. Is there a way to get a collections "schema" or is it just looking at the first document in that collection?

Best Answer

Schema Validation

You can specify Schema Validation on the document's within a collection to validate the kind of data types you can insert or update for each field. In addition, you can also set other constraints like a required field. This feature will allow you maintain a specific data type for a field. Note that Schema Validation is optional.


Field Types

By default, MongoDB number data types are of type double. There are other number types like int, long, float and NumberDecimal. You can convert the types within your queries as per your requirement. There are various operators for the conversion from one type to another (including conversion of numbers within strings and vice-versa). $convert is one such operator.

There are also operators to check the data type is of specific type, before or during processing.

Query operator $type example, finds documents where the price field type is string:

db.items.find( { price: { $type: "string" } } )

Aggregate operator $type example, gets all documents with a new field telling the data type of a specific field:

db.items.aggregate([
    { $addFields: { price_type: { $type: "$price" } } }
])

Notes:

Having mixed data types is something one comes across due to data feeds from various sources or systems. I suspect you need to handle these as needed - identifying and handling at some stage of the application. Regarding indexing on mixed types, I think there are indexing features you can take advantage of based upon your situation: Collation and Partial Indexes.