Collapse AllExpand All

2.2. Codes for Missing Data

In addition to simple single-valued NULL substitution described in the previous section, SciDB also supports multi-valued NULLs using the notion of missing reason codes. Missing reason codes allow an application to optionally specify multiple types of NULLs and treat each type differently.

For example, if a faulty instrument occasionally fails to report a reading, that attribute could be represented in a SciDB array as NULL. If an erroneous instrument reports readings that are out of valid bounds for an attribute, that may also be represented as NULL.

NULL must be represented using the token 'null' or '?' in place of the attribute value. In addition, NULL values can be tagged with a "missing reason code" to help a SciDB application distinguish among different types of null values—for example, assigning a unique code to the following types of errors: "instrument error", "cloud cover", or "not enough data for statistically significant result". Or, in the case of financial market data, data may be missing because "market closed", "trading halted", or "data feed down".

The examples below show how to represent missing data in the load file. A question mark (?) or null represent null values, and ?2 represents null value with a reason code of 2.

[[ ( 10, 4.5, "My String", 'C'), (10, 5.1, ?1, 'D'), 
(?2, 5.1, "Another String", ?) ...


[[ ( 10, 4.5, "My String", 'C'), (10, 5.1, ?1, 'D'), 
(?2, 5.1, "Another String", null) ...

Use the substitute operator to substitute different values for each type of NULL. For more information on NULL substitution, see the substitute operator reference.

Additionally, you can specify a missing reason code for the default value of an attribute. For example, the following statement specifies a missing reason code of 30 as the default value for the val3 attribute.

AFL% create array D<val1:char default 'a', val2:int32 default 1, 
     val3:int64 null default missing(30)> [i=0:9,10,0];  
[("D<val1:char DEFAULT 'a',val2:int32 DEFAULT 1,val3:int64 NULL DEFAULT ?30> [i=0:9,10,0]")]