Mongodb – Sharded Mongodb stalls randomly

azuremongodbsharding

I have setup Sharded MongoDB cluster using hashed sharding in kuberenetes.I first created the config server Replicaset and then created 2 shard replicasets. Finally created mongos to connect to the sharded cluster.

I followed the below link to setup sharded MongoDB Click https://docs.mongodb.com/manual/tutorial/deploy-sharded-cluster-hashed-sharding/

After creation of mongos,I have enabled sharding for the database and have sharded the collection using the hashed sharding strategy.

After all this setup,I'm able to connect to mongos and have added some data to some of the collections in the database and able to check the distribution of data across different shards.

The issue that I'm facing is when trying to access mongodb from my java spring boot project,the connection stalls randomly.But once the connection is established for a particular query, that particular query won't stall for next few tries.After some idle time if I try to make request again to mongodb,it will again start to stall.

Note : MongoDB is hosted in "DS2 v2" VM and this cluster has 4 nodes.1 for config server,2 for shards and 1 for mongos

In one of the link,they had asked to set proper shard key to all the collections and this will have an impact on the performance of the mongodb.There were couple of things to consider before selecting the right shard key,I had considered all those factors before selecting shard key.I read through this link to select shard key – Click https://www.mongodb.com/blog/post/on-selecting-a-shard-key-for-mongodb

One of the other solution that I came across was that to set the ShardingTaskExecutorPoolMaxConnecting and to limit the rate at which mongos nodes add connectons to connection pools.I tried setting it to 20,5,100,150 and none of this resolved the stalling issue that I'm facing. This is the link – Click https://jira.mongodb.org/browse/SERVER-29237

I tried tweaking other parameters like ShardingTaskExecutorPoolMinSize and taskExecutorPoolSize.Even this did not resolve stalling issue.

I also set –serviceExecutor as adaptive.

Increased the wiredTigerCacheSizeGB from 0.25 to 2.This also dint make any difference to the stalling issue

1) YAML file of service and Deployment for config server of mongodb is –

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-conf-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-conf-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--configsvr"
          - "--replSet"
          - "ConfigDBRepSet"
          image: #MongoImageName
          name: mongo-conf-service
          ports:
          - containerPort: 27017
          resources: {}
          volumeMounts:
          - name: mongo-conf
            mountPath: /data/db
        restartPolicy: Always
        volumes:
          - name: mongo-conf
            persistentVolumeClaim:
              claimName: mongo-conf

2) YAML file of service and Deployment for Shard mongodb is –

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-shard
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-shard
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--shardsvr"
          - "--replSet"
          - "Shard1RepSet"
          image: #MongoImage
          name: mongo-shard
          ports:
          - containerPort: 27017
          resources: {}

3) YAML File of mongos server:

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongos-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongos-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #USername
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
            - "numactl"
            - "--interleave=all"
            - "mongos"
            - "--port"
            - "27017"
            - "--bind_ip"
            - "0.0.0.0"
            - "--configdb"
            - "ConfigDBRepSet/mongo-conf-service:27017"
          image: #MongoImageName
          name: mongos-service
          ports:
          - containerPort: 27017
          resources: {}

The logs of mongos server is :

2019-08-05T05:27:52.942+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:5058 #308807 (79 connections now open)
2019-08-05T05:27:52.964+0000 I ACCESS   [conn308807] Successfully authenticated as principal Assist_Random_Workspace on Random_Workspace from client 10.0.0.0:5058
2019-08-05T05:27:54.267+0000 I NETWORK  [worker-3] end connection 10.0.0.0:52954 (78 connections now open)
2019-08-05T05:27:54.269+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:52988 #308808 (79 connections now open)
2019-08-05T05:27:54.275+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7174 #308809 (80 connections now open)
2019-08-05T05:27:54.279+0000 I ACCESS   [conn308809] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.:7174 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found
2019-08-05T05:27:54.281+0000 I NETWORK  [worker-1] end connection 10.0.0.5:7174 (79 connections now open)
2019-08-05T05:27:54.342+0000 I NETWORK  [worker-1] end connection 10.0.0.6:57391 (78 connections now open)
2019-08-05T05:27:54.343+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57527 #308810 (79 connections now open)
2019-08-05T05:27:55.080+0000 I NETWORK  [worker-3] end connection 10.0.0.0:56021 (78 connections now open)
2019-08-05T05:27:55.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56057 #308811 (79 connections now open)
2019-08-05T05:27:56.054+0000 I NETWORK  [worker-1] end connection 10.0.0.0:59137 (78 connections now open)
2019-08-05T05:27:56.055+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59184 #308812 (79 connections now open)
2019-08-05T05:27:59.268+0000 I NETWORK  [worker-1] end connection 10.0.0.5:52988 (78 connections now open)
2019-08-05T05:27:59.270+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53047 #308813 (79 connections now open)
2019-08-05T05:27:59.343+0000 I NETWORK  [worker-3] end connection 10.0.0.6:57527 (78 connections now open)
2019-08-05T05:27:59.344+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57672 #308814 (79 connections now open)
2019-08-05T05:28:00.080+0000 I NETWORK  [worker-3] end connection 10.0.1.1:56057 (78 connections now open)
2019-08-05T05:28:00.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56116 #308815 (79 connections now open)
2019-08-05T05:28:01.054+0000 I NETWORK  [worker-3] end connection 10.0.0.0:59184 (78 connections now open)
2019-08-05T05:28:01.058+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59225 #308816 (79 connections now open)
2019-08-05T05:28:01.763+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7173 #308817 (80 connections now open)
2019-08-05T05:28:01.768+0000 I ACCESS   [conn308817] SASL SCRAM-SHA-1 authentication failed for Assist_Sharded_Database on Sharded_Database from client 10.0.0.0:7173 ; UserNotFound: User "Assist_Sharded_Database@Sharded_Database" not found
2019-08-05T05:28:01.770+0000 I NETWORK  [worker-3] end connection 10.0.0.0:7173 (79 connections now open)
2019-08-05T05:28:04.271+0000 I NETWORK  [worker-3] end connection 10.0.0.0:53047 (78 connections now open)
2019-08-05T05:28:04.272+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53083 #308818 (79 connections now open)
2019-08-05T05:28:04.283+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7105 #308819 (80 connections now open)
2019-08-05T05:28:04.287+0000 I ACCESS   [conn308819] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.0:7105 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found

Java Code block to connect to MongoDB is –

Note:The below code supports multitenancy of MongoDB at Database level.Based on one of the parameter in every request,we will determine from which database to query from.

The below code will work fine for Standalone MongoDB instance.

1) Application property

mongodb.uri=${mongoURI:mongodb://username:password@IPaddress:portNumber}
mongodb.defaultDatabaseName=assist
assist-server-address1 = IpAddress

2) Spring Boot Application

@SpringBootApplication
@ServletComponentScan
public class ServiceApplication extends RepositoryRestConfigurerAdapter {
    public static void main(String[] args) {
        SpringApplication.run(ServiceApplication.class, args);
    }

    @Autowired
    public MongoDBCredentials mongoDBCredentials;

    @Bean
    public MongoTemplate mongoTemplate() {
        return new MongoTenantTemplate(
                new SimpleMongoDbFactory(new MongoClient(new MongoClientURI(mongoDBCredentials.getUri())),
                        mongoDBCredentials.getDefaultDatabaseName()));
    }

}

3)MongoTemplate -> This is used to establish Mongo connection.We have implemented Multitenant Mongo(To connect to multiple database based on one of the parameter of the request)

public class MongoTenantTemplate extends MongoTemplate {
    private static Map<String, MongoTemplate> tenantTemplates = new HashMap<>();

    @Value("${assist-server-address1}")
    public String ServerAddress1;

    @Value("${spring.data.mongodb.username}")
    public String ServerUsername;

    @Value("${spring.data.mongodb.password}")
    public String ServerPassword;

    @Value("${spring.data.mongodb.database}")
    public String ServerDbName;

    @Value("${assist-current-environment}")
    public String currentEnv;

    @Autowired
    public MongoDBCredentials mongoDBCredentials;

    @Autowired
    WorkspacesRepository workspaceRepository;

    private static final Logger LOG = LoggerFactory.getLogger(MongoTenantTemplate.class);
    Marker marker;

    public MongoTenantTemplate(MongoDbFactory mongoDbFactory) {
        super(mongoDbFactory);
        tenantTemplates.put(mongoDbFactory.getDb().getName(), new MongoTemplate(mongoDbFactory));
    }

    protected MongoTemplate getTenantMongoTemplate(String tenant) {

        MongoTemplate mongoTemplate = tenantTemplates.computeIfAbsent(tenant, k -> null);
        LOG.info(marker, "Tenant is (MongoDBCredentials) : {}",tenant);
        try {
            if (mongoTemplate == null) {
                MongoCredential mongoCredential;
                // Username,databaseName,password
                if (tenant == ServerDbName) {
                    mongoCredential = MongoCredential.createCredential(ServerUsername, ServerDbName,
                            ServerPassword.toCharArray());
                } else {
                    Workspaces workspace = workspaceRepository.findByDbName(tenant);
                    mongoCredential = MongoCredential.createCredential(workspace.getDbUserName(), workspace.getDbName(),
                            workspace.getDbPassword().toCharArray());
                }
                ServerAddress address1 = new ServerAddress(ServerAddress1, port);

                List<ServerAddress> serverAddressList = new ArrayList<>();
                serverAddressList.add(address1);

                SimpleMongoDbFactory mongoDbFactory = new SimpleMongoDbFactory(
                        new MongoClient(serverAddressList, Arrays.asList(mongoCredential)), tenant);

                mongoTemplate = new MongoTemplate(mongoDbFactory);
            }
            else {
            }
        } catch (Exception e) {
            tenantTemplates.remove(tenant);
        }
        return mongoTemplate;
    }
...

In the above logs,there is an error in authentication to Assist_Refactored_Code_DB(This database is not created by me).Im not sure why this authentication is failing and in which mongo URI the username and password should be mentioned.And Im also not sure whether this is one of the reason for stalling or not. This is the only error logs that I could find in mongos.All other logs in config server and shard mongo doesnt have any errors.

I expect the sharded mongodb to not stall at any point of time and work similar to standalone mongodb.

Can anyone guide me to resolve the stalling of sharded mongodb issue?

Best Answer

Your mongo logs throws authentication error:

2019-08-05T05:27:54.279+0000 I ACCESS   [conn308809] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.:7174 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found

Please check if the user "Assist_Refactored_Code_DB@Refactored_Code_DB" is created in the users collections first

Thank you!

-Anban