Importing Tricks - Importing CSV Files With Special Characters

Link: https://support.brilliantdirectories.com/support/solutions/articles/12000001776-importing-tricks-importing-csv-files-with-special-characters

The database of all Brilliant Directories sites is collated in UTF-8 format (utf8-general-ci). This is the most universal format available today and allows for nearly all special characters in nearly all languages to be stored in the database correctly. 


However, most desktop spreadsheet programs do a very poor job of handling files collated in UTF-8 format (they are a bit behind the times. Mac and Windows for example with some of their office tools).


Thankfully, Google Sheets, the free spreadsheet application offered by Google, natively handles data in UTF-8 format, making it a good tool to use for creating import files that contain special characters.


How To Format The Import File In UTF-8 Using Google Sheets

  1. Create a new spreadsheet in Google Sheets
  2. Click on File >> Import
  3. Find the existing CSV file already uploaded to Google Drive, or upload it through this dialog by selecting the "Upload" option
  4. Click on "Import"
  5. Click on "Open now" once complete


Now we need to download this data as a CSV file. This CSV file will be natively encoded in UTF-8 format.  


To download the CSV file, click on File >> Download as >> Comma-separated values.



 

The file that has been downloaded is a CSV file with the correct UTF-8 format.  If this file is opened in Excel or a similar spreadsheet program, the values may look different - this is because desktop spreadsheet programs do not handle UTF-8 files well as mentioned above.  

 

DO NOT SAVE THE FILE AFTER DOWNLOADING IT!  If the file is saved after opening it in a desktop spreadsheet program, then it will likely change the formatting from UTF-8 again.


Next, import the file to the site as normal, and all of the special characters should remain intact.


Google Sheets Limitations


The only downside to using Google Sheets is that it has a limitation on the amount of data a single sheet can handle at a time (currently 2 million cells per spreadsheet).


For large files that contain more than 2 million cells of data, they just have to be broken down into smaller files to be processed correctly.


For example, if the import file has 8 columns, then Google Sheets can handle 250,000 rows in each individual spreadsheet ( 8 columns x 250,000 rows = 2,000,000 cells ).