Sampling in MS Excel
Here are the steps that are needed to take a random sample of observations from within Microsoft Excel. Please note that using spreadsheets to manage data is a generally bad idea... these instructions are here for those folks who don't have any other tools.
Assuming that your spreadsheet has one row of column names (in row 1), do this:
- In the first row of the first column to the right of your data, give the sorting column a name, say
Sorter. Do not leave any blank columns between the data and the new column.
- In the second row of the first column to the right of your data, type
=rand(), and hit enter. You need both the
= and the
- Select the cell with the random number in it.
- Copy the formula to all the other rows which have data either by:
- dragging the drag box in the lower right-hand corner (good for small datasets), or
- copying and pasting via
Once all rows have a random number in them,
- Copy the cell.
- Use the scroll bar or page down to get to the bottom of the dataset. Do not use any fancy-schmancy keyboard shortcuts, since you want to leave the selected cell as is.
- Shift-click on the last cell in the new random-number column.
If you think you made a mistake in selecting the data, hit
- Select *all* the data in the spreadsheet, including the first row with the colukn names. (this is critical)
- Go to
Sort by, choose
Sorter. It does not matter if you sort ascending or descending.
- Be sure to check that MS Excel thinks you have a header row, so the column names don't get messed in with the data.
- Click OK
If you think you are all set, and you won't need to sort any more, delete the column with the random numbers in it.
You can now sample by working your way down the rows in the table.
Things to note:
- Every time your spreadsheet changes (and hence does a recalculation), all the random numbers will change. This is normal.
- If you don't select all the data before sorting, you could ruin your dataset. Be careful.