Data preparation in Microsoft Excel

Before you load a Microsoft Excel file into an analysis, make sure that the spreadsheet contains no irrelevant information and is well structured, so that the data is not misinterpreted. Two common preparation steps are removing contextual information and combining similar columns into one.

The tabular data in an Excel spreadsheet is represented as a data table in your analysis. The first row that contains data is interpreted as the names of the data columns in the table, and the rows that follow are interpreted as data rows.
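
The interpretation rule above can be illustrated with a small Python sketch. It is not the loader's actual implementation. The function name `rows_to_table` and the sample values are hypothetical, and the sheet is assumed to be already available as a list of rows.

```python
def rows_to_table(rows):
    """Use the first row that contains data as column names; the rest are data rows."""

    def is_empty(row):
        # A row counts as empty when every cell is None or an empty string.
        return all(cell is None or cell == "" for cell in row)

    # Index of the first row that contains any data, or None if there is none.
    start = next((i for i, row in enumerate(rows) if not is_empty(row)), None)
    if start is None:
        return [], []

    columns = [str(cell) for cell in rows[start]]
    records = [dict(zip(columns, row)) for row in rows[start + 1:]]
    return columns, records


# Hypothetical sheet content: one empty row, then headers, then data.
sheet = [
    [None, None, None],
    ["Date", "Desk", "Tickets"],
    ["2024-01-01", "A", 120],
    ["2024-01-02", "B", 95],
]
columns, records = rows_to_table(sheet)
```

Here `columns` holds the three header names and `records` holds one dictionary per data row, which mirrors how the first row with data becomes the column names of the data table.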

Remove contextual information

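The following illustration shows a spreadsheet containing some contextual information above the actual data table. Because the first row with data is interpreted as column names, the contextual text would be read as column names and the real headers as data, so the data is misinterpreted.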

Remove any contextual information before loading the data. In the sheet below, nothing precedes the actual data set, so it is interpreted correctly.
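
If the cleanup has to be repeated for many files, it could also be scripted. The following Python sketch shows one possible approach, assuming you already know the header names you expect. The function `drop_context_rows` and the sample rows are hypothetical, not part of any product.

```python
def drop_context_rows(rows, expected_header):
    """Return the rows starting at the header row, dropping everything above it."""
    width = len(expected_header)
    for i, row in enumerate(rows):
        # Compare only the leading cells, so trailing empty cells do not matter.
        if list(row[:width]) == list(expected_header):
            return rows[i:]
    raise ValueError("expected header row not found")


# Hypothetical sheet with a title, a note, and an empty row above the data table.
raw = [
    ["Ticket sales report", None, None],
    ["Prepared by the front office", None, None],
    [None, None, None],
    ["Date", "Desk", "Tickets"],
    ["2024-01-01", "A", 120],
]
clean = drop_context_rows(raw, ["Date", "Desk", "Tickets"])
```

After the call, `clean` starts with the header row, so the first row with data is the one that should supply the column names. Raising an error when the header is missing is a design choice in this sketch: it avoids silently loading a sheet whose layout differs from what was expected.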



Combine columns

The following Excel spreadsheet has many similar columns. They contain the numbers of sold entrance tickets for five different desks.

The total number of tickets sold is easier to visualize when the sales data is combined into one column. In the following table, the values from the five desks have been combined into a single column.
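
The same restructuring can be sketched in Python as a wide-to-long conversion. This is an illustration only. The column names ("Day", "Desk 1" to "Desk 5", "Tickets sold"), the ticket counts, and the function `combine_columns` are all hypothetical, and the sketch assumes that a label column is kept so each value can still be traced to its desk.

```python
def combine_columns(rows, id_columns, value_columns,
                    category_name="Desk", value_name="Tickets sold"):
    """Turn several similar columns into one value column plus one label column."""
    combined = []
    for row in rows:
        for column in value_columns:
            # Keep the identifying columns, then record which desk the value came from.
            record = {key: row[key] for key in id_columns}
            record[category_name] = column
            record[value_name] = row[column]
            combined.append(record)
    return combined


# Hypothetical wide data: one column per desk.
wide = [
    {"Day": "Monday", "Desk 1": 10, "Desk 2": 12, "Desk 3": 7, "Desk 4": 9, "Desk 5": 11},
    {"Day": "Tuesday", "Desk 1": 8, "Desk 2": 15, "Desk 3": 6, "Desk 4": 10, "Desk 5": 13},
]
desk_columns = [f"Desk {n}" for n in range(1, 6)]
long_rows = combine_columns(wide, ["Day"], desk_columns)

# With all values in one column, the total is a single sum.
total_sold = sum(row["Tickets sold"] for row in long_rows)
```

Each input row produces five output rows, one per desk, and the total comes from one column instead of five.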