Mysql – Why does updating our database take so long

MySQLupsert

I currently have a small app take in a CSV and call a procedure to update/insert records to keep our MySQL 5.6.39 database up to date with the database of an external web app we use (that we don't have direct access or control over). We update weekly and right now it can take most of a day to finish (~60,000 records), so I have to speed things up somehow.

This is one of the procedure's we use:

DELIMITER //

CREATE PROCEDURE `upsert-debtors` (
    internalCaseIDParam VARCHAR(50),
    caseNumberParam VARCHAR(50),
    bankruptcyCaseNumberParam VARCHAR(50),
    bankruptcyChapterNumberParam VARCHAR(50),
    bankruptcyDateFiledParam VARCHAR(50),
    bankruptcyDischargeDateParam VARCHAR(50),
    bankruptcyDismissalDateParam VARCHAR(50),
    accountReceivedDateParam VARCHAR(50),
    clientCommissionRateParam VARCHAR(50),
    clientFullNameParam VARCHAR(50),
    clientNameParam VARCHAR(50),
    collectionStatusDateParam VARCHAR(50),
    collectorUserNameParam VARCHAR(50),
    creditorParam VARCHAR(50),
    currentBalanceDueParam VARCHAR(50),
    currentClaimStatusParam VARCHAR(50),
    currentCostsParam VARCHAR(50),
    currentFeesParam VARCHAR(50),
    currentInterestParam VARCHAR(50),
    currentPerDiemParam VARCHAR(50),
    currentPrincipalParam VARCHAR(50),
    dateEnteredInSimplicityParam VARCHAR(50),
    debtorAkaParam VARCHAR(50),
    debtorAddressOneParam VARCHAR(50),
    debtorAddressTwoParam VARCHAR(50),
    debtorAddressStatusParam VARCHAR(50),
    debtorCellParam VARCHAR(50),
    debtorCityParam VARCHAR(50),
    debtorCompanyContactParam VARCHAR(50),
    debtorCompanyNameParam VARCHAR(50),
    debtorDobParam VARCHAR(50),
    debtorEmailParam VARCHAR(50),
    debtorFaxParam VARCHAR(50),
    debtorFirstNameParam VARCHAR(50),
    debtorFullNameParam VARCHAR(50),
    debtorLastNameParam VARCHAR(50),
    debtorLinkedBalanceParam VARCHAR(50),
    debtorMiddleNameParam VARCHAR(50),
    debtorOtherPhoneParam VARCHAR(50),
    debtorPhoneParam VARCHAR(50),
    debtorPrefixParam VARCHAR(50),
    debtorSSNParam VARCHAR(50),
    debtorStateParam VARCHAR(50),
    debtorSuffixParam VARCHAR(50),
    debtorZipParam VARCHAR(50),
    docketNumberParam VARCHAR(50),
    employerNameParam VARCHAR(50),
    employerPhoneParam VARCHAR(50),
    firstDelinquencyDateParam VARCHAR(50),
    hotZipParam VARCHAR(50),
    importNotesParam VARCHAR(250),
    lastAccessDateParam VARCHAR(50),
    lastDateReportedToEquifaxParam VARCHAR(50),
    lastDateReportedToExperianParam VARCHAR(50),
    lastDateReportedToTransunionParam VARCHAR(50),
    lastPaymentAmountParam VARCHAR(50),
    lastPaymentDateParam VARCHAR(50),
    lastPaymentNettedAmountParam VARCHAR(50),
    lastWorkDateParam VARCHAR(50),
    neverReportToCreditBureausParam VARCHAR(50),
    nextWorkDateParam VARCHAR(50),
    oldSysAcctSParam VARCHAR(50),
    isClosedParam VARCHAR(50),
    originalClaimAmountParam VARCHAR(50),
    originalClaimInterestRateParam VARCHAR(50),
    originatedDateParam VARCHAR(50),
    patientParam VARCHAR(50),
    patientDobParam VARCHAR(50),
    patientSsnParam VARCHAR(50),
    payerParam VARCHAR(50),
    paymentPlanAmount1Param VARCHAR(50),
    paymentPlanAmount2Param VARCHAR(50),
    paymentPlanAmount3Param VARCHAR(50),
    paymentPlanAmount4Param VARCHAR(50),
    paymentPlanAmount5Param VARCHAR(50),
    paymentPlanAmount6Param VARCHAR(50),
    nextPaymentPlanDueAmountParam VARCHAR(50),
    paymentPlanDate1Param VARCHAR(50),
    paymentPlanDate2Param VARCHAR(50),
    paymentPlanDate3Param VARCHAR(50),
    paymentPlanDate4Param VARCHAR(50),
    paymentPlanDate5Param VARCHAR(50),
    paymentPlanDate6Param VARCHAR(50),
    nextPaymentPlanDueDateParam VARCHAR(50),
    pmtPlanOverdueParam VARCHAR(50),
    physicianParam VARCHAR(50),
    pifCatalystParam VARCHAR(50),
    referringAttorneyNameParam VARCHAR(50),
    totalAttorneyFeesParam VARCHAR(50),
    totalClaimAmountParam VARCHAR(50),
    totalCostsParam VARCHAR(50),
    totalInterestParam VARCHAR(50),
    totalPaymentsParam VARCHAR(50),
    totalPaymentsPreAndPostParam VARCHAR(50),
    workflowBandParam VARCHAR(50),
    clientClaimNumberParam VARCHAR(50)
)
BEGIN
    UPDATE
        debtors
    SET
        case_number = caseNumberParam,
        bankruptcy_case_number = bankruptcyCaseNumberParam,
        bankruptcy_chapter_number = bankruptcyChapterNumberParam,
        bankruptcy_date_filed = bankruptcyDateFiledParam,
        bankruptcy_discharge_date = bankruptcyDischargeDateParam,
        bankruptcy_dismissal_date = bankruptcyDismissalDateParam,
        account_received_date = accountReceivedDateParam,
        client_commission_rate = clientCommissionRateParam,
        client_full_name = clientFullNameParam,
        client_name = clientNameParam,
        collection_status_date = collectionStatusDateParam,
        collector_user_name = collectorUserNameParam,
        creditor = creditorParam,
        current_balance_due = currentBalanceDueParam,
        current_claim_status = currentClaimStatusParam,
        current_costs = currentCostsParam,
        current_fees = currentFeesParam,
        current_interest = currentInterestParam,
        current_per_diem = currentPerDiemParam,
        current_principal = currentPrincipalParam,
        date_entered_in_simplicity = dateEnteredInSimplicityParam,
        debtor_aka = debtorAkaParam,
        debtor_address_one = debtorAddressOneParam,
        debtor_address_two = debtorAddressTwoParam,
        debtor_address_status = debtorAddressStatusParam,
        debtor_cell = debtorCellParam,
        debtor_city = debtorCityParam,
        debtor_company_contact = debtorCompanyContactParam,
        debtor_company_name = debtorCompanyNameParam,
        debtor_dob = debtorDobParam,
        debtor_email = debtorEmailParam,
        debtor_fax = debtorFaxParam,
        debtor_first_name = debtorFirstNameParam,
        debtor_full_name = debtorFullNameParam,
        debtor_last_name = debtorLastNameParam,
        debtor_linked_balance = debtorLinkedBalanceParam,
        debtor_middle_name = debtorMiddleNameParam,
        debtor_other_phone = debtorOtherPhoneParam,
        debtor_phone = debtorPhoneParam,
        debtor_prefix = debtorPrefixParam,
        debtor_s_s_n = debtorSSNParam,
        debtor_state = debtorStateParam,
        debtor_suffix = debtorSuffixParam,
        debtor_zip = debtorZipParam,
        docket_number = docketNumberParam,
        employer_name = employerNameParam,
        employer_phone = employerPhoneParam,
        first_delinquency_date = firstDelinquencyDateParam,
        hot_zip = hotZipParam,
        import_notes = importNotesParam,
        last_access_date = lastAccessDateParam,
        last_date_reported_to_equifax = lastDateReportedToEquifaxParam,
        last_date_reported_to_experian = lastDateReportedToExperianParam,
        last_date_reported_to_transunion = lastDateReportedToTransunionParam,
        last_payment_amount = lastPaymentAmountParam,
        last_payment_date = lastPaymentDateParam,
        last_payment_netted_amount = lastPaymentNettedAmountParam,
        last_work_date = lastWorkDateParam,
        never_report_to_credit_bureaus = neverReportToCreditBureausParam,
        next_work_date = nextWorkDateParam,
        old_sys_acct_s = oldSysAcctSParam,
        is_closed = isClosedParam,
        original_claim_amount = originalClaimAmountParam,
        original_claim_interest_rate = originalClaimInterestRateParam,
        originated_date = originatedDateParam,
        patient = patientParam,
        patient_dob = patientDobParam,
        patient_ssn = patientSsnParam,
        payer = payerParam,
        payment_plan_amount_1 = paymentPlanAmount1Param,
        payment_plan_amount_2 = paymentPlanAmount2Param,
        payment_plan_amount_3 = paymentPlanAmount3Param,
        payment_plan_amount_4 = paymentPlanAmount4Param,
        payment_plan_amount_5 = paymentPlanAmount5Param,
        payment_plan_amount_6 = paymentPlanAmount6Param,
        next_payment_plan_due_amount = nextPaymentPlanDueAmountParam,
        payment_plan_date_1 = paymentPlanDate1Param,
        payment_plan_date_2 = paymentPlanDate2Param,
        payment_plan_date_3 = paymentPlanDate3Param,
        payment_plan_date_4 = paymentPlanDate4Param,
        payment_plan_date_5 = paymentPlanDate5Param,
        payment_plan_date_6 = paymentPlanDate6Param,
        next_payment_plan_due_date = nextPaymentPlanDueDateParam,
        pmt_plan_overdue = pmtPlanOverdueParam,
        physician = physicianParam,
        pif_catalyst = pifCatalystParam,
        referring_attorney_name = referringAttorneyNameParam,
        total_attorney_fees = totalAttorneyFeesParam,
        total_claim_amount = totalClaimAmountParam,
        total_costs = totalCostsParam,
        total_interest = totalInterestParam,
        total_payments = totalPaymentsParam,
        total_payments_pre_and_post = totalPaymentsPreAndPostParam,
        workflow_band = workflowBandParam,
        client_claim_number = clientClaimNumberParam
    WHERE internal_case_id = internalCaseIDParam;

    IF ROW_COUNT() = 0 THEN
        INSERT INTO
          debtors (
            internal_case_id,
            case_number,
            bankruptcy_case_number,
            bankruptcy_chapter_number,
            bankruptcy_date_filed,
            bankruptcy_discharge_date,
            bankruptcy_dismissal_date,
            account_received_date,
            client_commission_rate,
            client_full_name,
            client_name,
            collection_status_date,
            collector_user_name,
            creditor,
            current_balance_due,
            current_claim_status,
            current_costs,
            current_fees,
            current_interest,
            current_per_diem,
            current_principal,
            date_entered_in_simplicity,
            debtor_aka,
            debtor_address_one,
            debtor_address_two,
            debtor_address_status,
            debtor_cell,
            debtor_city,
            debtor_company_contact,
            debtor_company_name,
            debtor_dob,
            debtor_email,
            debtor_fax,
            debtor_first_name,
            debtor_full_name,
            debtor_last_name,
            debtor_linked_balance,
            debtor_middle_name,
            debtor_other_phone,
            debtor_phone,
            debtor_prefix,
            debtor_s_s_n,
            debtor_state,
            debtor_suffix,
            debtor_zip,
            docket_number,
            employer_name,
            employer_phone,
            first_delinquency_date,
            hot_zip,
            import_notes,
            last_access_date,
            last_date_reported_to_equifax,
            last_date_reported_to_experian,
            last_date_reported_to_transunion,
            last_payment_amount,
            last_payment_date,
            last_payment_netted_amount,
            last_work_date,
            never_report_to_credit_bureaus,
            next_work_date,
            old_sys_acct_s,
            is_closed,
            original_claim_amount,
            original_claim_interest_rate,
            originated_date,
            patient,
            patient_dob,
            patient_ssn,
            payer,
            payment_plan_amount_1,
            payment_plan_amount_2,
            payment_plan_amount_3,
            payment_plan_amount_4,
            payment_plan_amount_5,
            payment_plan_amount_6,
            next_payment_plan_due_amount,
            payment_plan_date_1,
            payment_plan_date_2,
            payment_plan_date_3,
            payment_plan_date_4,
            payment_plan_date_5,
            payment_plan_date_6,
            next_payment_plan_due_date,
            pmt_plan_overdue,
            physician,
            pif_catalyst,
            referring_attorney_name,
            total_attorney_fees,
            total_claim_amount,
            total_costs,
            total_interest,
            total_payments,
            total_payments_pre_and_post,
            workflow_band,
            client_claim_number
          )
        VALUES
          (         
            internalCaseIDParam,
            caseNumberParam,
            bankruptcyCaseNumberParam,
            bankruptcyChapterNumberParam,
            bankruptcyDateFiledParam,
            bankruptcyDischargeDateParam,
            bankruptcyDismissalDateParam,
            accountReceivedDateParam,
            clientCommissionRateParam,
            clientFullNameParam,
            clientNameParam,
            collectionStatusDateParam,
            collectorUserNameParam,
            creditorParam,
            currentBalanceDueParam,
            currentClaimStatusParam,
            currentCostsParam,
            currentFeesParam,
            currentInterestParam,
            currentPerDiemParam,
            currentPrincipalParam,
            dateEnteredInSimplicityParam,
            debtorAkaParam,
            debtorAddressOneParam,
            debtorAddressTwoParam,
            debtorAddressStatusParam,
            debtorCellParam,
            debtorCityParam,
            debtorCompanyContactParam,
            debtorCompanyNameParam,
            debtorDobParam,
            debtorEmailParam,
            debtorFaxParam,
            debtorFirstNameParam,
            debtorFullNameParam,
            debtorLastNameParam,
            debtorLinkedBalanceParam,
            debtorMiddleNameParam,
            debtorOtherPhoneParam,
            debtorPhoneParam,
            debtorPrefixParam,
            debtorSSNParam,
            debtorStateParam,
            debtorSuffixParam,
            debtorZipParam,
            docketNumberParam,
            employerNameParam,
            employerPhoneParam,
            firstDelinquencyDateParam,
            hotZipParam,
            importNotesParam,
            lastAccessDateParam,
            lastDateReportedToEquifaxParam,
            lastDateReportedToExperianParam,
            lastDateReportedToTransunionParam,
            lastPaymentAmountParam,
            lastPaymentDateParam,
            lastPaymentNettedAmountParam,
            lastWorkDateParam,
            neverReportToCreditBureausParam,
            nextWorkDateParam,
            oldSysAcctSParam,
            isClosedParam,
            originalClaimAmountParam,
            originalClaimInterestRateParam,
            originatedDateParam,
            patientParam,
            patientDobParam,
            patientSsnParam,
            payerParam,
            paymentPlanAmount1Param,
            paymentPlanAmount2Param,
            paymentPlanAmount3Param,
            paymentPlanAmount4Param,
            paymentPlanAmount5Param,
            paymentPlanAmount6Param,
            nextPaymentPlanDueAmountParam,
            paymentPlanDate1Param,
            paymentPlanDate2Param,
            paymentPlanDate3Param,
            paymentPlanDate4Param,
            paymentPlanDate5Param,
            paymentPlanDate6Param,
            nextPaymentPlanDueDateParam,
            pmtPlanOverdueParam,
            physicianParam,
            pifCatalystParam,
            referringAttorneyNameParam,
            totalAttorneyFeesParam,
            totalClaimAmountParam,
            totalCostsParam,
            totalInterestParam,
            totalPaymentsParam,
            totalPaymentsPreAndPostParam,
            workflowBandParam,
            clientClaimNumberParam
          );
    END IF;
END //

DELIMITER ;

Is there anything as far as the procedure is set up or possibly some database settings that would cause it to be as slow as it is?

Thank you!

Best Answer

Databases are good at doing things en masse; they are slow at doing things one at a time.

  1. LOAD DATA (or otherwise load the data) into a temporary table
  2. Use IODKU:

    INSERT INTO debtors (...)
        SELECT ...
            FROM temp_table
        ON DUPLICATE KEY UPDATE ...;
    

To use IODKU, there needs to be a UNIQUE (or PRIMARY) key so that it knows which row to test for the existence of.

Be sure to use VALUES(col) in the UPDATE part. (See the documentation.)

You should seriously consider splitting that very-wide table into a few tables such as bankruptcy, debtor, last_payment, etc. In doing so, you will quickly discover that you often don't need to update all hundred columns.

Also see what might reasonably be 'normalized'. Both of these will make the IODKU step more complex.

payment_plan seems to have a limit of 6. And probably has most rows empty. It would be better to have the 6 as rows, not columns. A particular debtor would have 0 or more rows, not limited to 6.

When you are running the update, does it interfere with other activities? If so, chunk it into groups of 100 rows at a time.