Returning Indices of a Subsetted Table
In some situations you might want to build a table and associate index numbers with each row in the result. These index numbers can be useful, particularly when the result of a query generates a very large table that requires a large amount of memory to store. One way to save memory in such a situation is to create a query statement that generates a result containing only the indices of the rows that you are interested in. Then, a print statement allows you to print the rows of interest without first storing them in a variable, which, in some cases, might be too large to hold in memory along with the original table.
The following example demonstrates this technique. In this example, a new table is built from phone_data with an extra column called INDEX. This extra column is simply a 1D array of integers in the range {0...14} created with the INDGEN function.
INDEX = INDGEN(15)
Now, a new table is created from the original table of telephone data, with INDEX included as one of the table’s columns.
newtbl = BUILD_TABLE('INDEX, EXT, DUR, COST')
Next, this new table can be subsetted with QUERY_TABLE so that the result contains only the indices of the rows in which you are interested. Because the resulting table contains only the indices of the desired rows, much less memory is required to store the result than if all of the data in the desired rows were stored.
result = QUERY_TABLE(newtbl, 'INDEX Where COST > .50')
Finally, the following statements perform a more meaningful sort, where the indices stored in result are used to locate the desired rows in the newtbl table.
FOR i=0, N_ELEMENTS(result) - 1 DO BEGIN $
PRINT, newtbl(result(i).index)
note | This method of subsetting tables based on row indices does not work if a Group By clause is used in the QUERY_TABLE command. The reason for this is that Group By clauses typically return the results of calculations, and these results usually have no counterpart in the original table. |