Another (and better) method for importing Excel tabular data into MySQL

authored by Frank Lynam at 26/08/2013 13:14:12

About a week ago I wrote a post about how to go about importing Excel data into a MySQL database. In it I described how you would create an OpenOffice spreadsheet ODS file and then use the ‘Import’ feature of phpMyAdmin to do the conversion of this file into SQL commands that would be run behind the scenes to import the data. I thought that that was the solution to the problem but, unfortunately, I was wrong.

Having imported an Excel worksheet that had around 2,500 rows, I checked the first few row values against the MySQL records that were created and they all seemed to match up. A couple of days later, however, I noticed that one of the entries had its values mixed up. What had been entered into one column in Excel was now appearing in another column in MySQL. I investigated this pretty thoroughly but couldn’t come up with any reason for this to be happening other than there being a bug in the ODS import method in phpMyAdmin. As I’ve got a decent amount of data and don’t have the time to go through all the MySQL rows to check that they matched up with the Excel source file, I decided that I needed to write my own VBA Excel to SQL command converter to be 100% sure that the data going in was correct.

The previous post used a similar method employed here to create SQL commands to create the table’s structure based on the structure of the Excel file. The following script assumes that the table structure is setup correctly. It loops through the Excel worksheet and adds a new entry for each row found.


Public Function PopulateMySQLTable(ByVal szWSName As String, _
                                   ByVal nFieldNamesRowIndex As Integer, _
                                   ByVal szNameColumnName As String, _
                                   ByVal szSQLTableName As String, _
                                   ByVal szSQLFileName As String _
                                   ) As Boolean
    Dim i As Integer
    
    PopulateMySQLTable = False
    
    If ((szWSName = "") Or (szNameColumnName = "") Or (szSQLTableName = "") Or (szSQLFileName = "")) Then
        Exit Function
    End If
    
    Dim nColumnCount As Integer
    nColumnCount = Cells(nFieldNamesRowIndex, Columns.Count).End(xlToLeft).Column
    If (nColumnCount < 1) Then
        Exit Function
    End If
    
    Dim nFileNumber As Integer
    nFileNumber = FreeFile()
    Open szSQLFileName For Output As nFileNumber
    
    Print #nFileNumber, "SET NAMES utf8;"
    
    Dim szLine As String
    szLine = "INSERT INTO " & szSQLTableName & " ("
    Print #nFileNumber, szLine
    szLine = ""
    
    'ADD ALL THE FIELD NAMES
    For i = 1 To nColumnCount
        Dim szColumnLetter As String
        szColumnLetter = GetColumnLetterFromIndex(i)
        If (szColumnLetter <> "") Then
            Dim szFieldName As String
            szFieldName = Range(szColumnLetter & "1").Text
            If (szFieldName <> "") Then
                If (i > 1) Then
                    szLine = ","
                End If
                szLine = (szLine & "`" & szFieldName & "`")
                Print #nFileNumber, szLine
            End If
        End If
    Next
    Print #nFileNumber, ")"
    
    'NOW ADD EACH OF THE ROWS
    Print #nFileNumber, "VALUES"
    Dim bContinue As Boolean
    Dim nRowIndex As Long
    nRowIndex = (nFieldNamesRowIndex + 1)
    bContinue = True
    Do While (bContinue)
        'IS THIS AN EMPTY ROW?
        If (Range(szNameColumnName & nRowIndex).Text = "") Then
            bContinue = False
        Else
            If (nRowIndex > (nFieldNamesRowIndex + 1)) Then
                Print #nFileNumber, ","
            End If
            Print #nFileNumber, "("
            For i = 1 To nColumnCount
                szColumnLetter = GetColumnLetterFromIndex(i)
                If (szColumnLetter <> "") Then
                    Dim szValue As String
                    szValue = Range(szColumnLetter & nRowIndex).Text
                    If (szValue <> "") Then
                        szValue = Replace(szValue, vbCrLf, " ")
                        szValue = Replace(szValue, Chr(10), " ")
                        szValue = Replace(szValue, Chr(13), " ")
                        szValue = Replace(szValue, "`", " ")
                        szValue = Replace(szValue, "'", " ")
                        szLine = ("'" & szValue & "'")
                    Else
                        szLine = ("NULL")
                    End If
                    If (i < nColumnCount) Then
                        szLine = (szLine & ",")
                    End If
                    Print #nFileNumber, szLine
                End If
            Next
            Print #nFileNumber, ")"
            nRowIndex = (nRowIndex + 1)
        End If
    Loop
    Print #nFileNumber, ";"
    
    Close #nFileNumber
    
    PopulateMySQLTable = True
End Function

The function takes 5 arguments. The first is the name of the source worksheet. The second is the row index where the column headers appear (usually ‘1’). The third is the column name for a value that must be set for each row to be valid. The algorithm uses this to check if it should continue searching for more rows or not. The final two arguments specify the name of the SQL table in the form ‘database_name’.’table_name’ and the path to the .SQL file that will be created to house the SQL commands.

Run this VBA macro and then use phpMyAdmin to import the outputted .SQL file in order to populate your table.

One final thing to note is that I needed to change a few things in the structure of the table in order to allow some of the text contained in the Excel fields to be entered into MySQL using this method. Essentially, I needed to allow the text fields to support Unicode character-sets that were present. In order to do this, I needed to make all the MySQL text fields to be of type TEXT and for them to have a collation value of ‘utf8_unicode_ci’. The final change required was to change the collation of the table itself to be ‘utf8_unicode_ci’.

Comments

submit