Lists, byte arrays, sets and maps

The following classes are included in the core SDK as part of the collections.toit module:

List provides an ordered and indexed collection which may contain duplicates. This is roughly analogous to extendable arrays in other languages.

ByteArray is similar to a list for integer elements in the range 0 to 255. Byte arrays cannot change size, but see Buffer which has a write method and can be extended.

Set provides a collection of unique objects. When iterating over a Set, the objects are processed in insertion order.

Map provides a data structure mapping keys to values. Like Set, iteration is in insertion order. Overwriting the value associated with a key does not change the order of the key-value pair.

The Set and Map implementations use hashing to do their operations in constant speed. As a consequence their keys must have a hash-code member.

There are two concrete implementations of Set and Map: Set/Map use the overwritable == operator to determine whether two keys are equal, whereas IdentitySet/IdentityMap use identical.

See some examples of use of List and Map in the section Toit blocks.

See also the documentation for the List, Set, and Map classes.

The string and ByteArray classes are not subtypes of Collection, but have some similarities to collections.

Set versus List versus Map

  • Duplicates: A big difference between List and Set class is that List allows duplicates while Set doesn't allow duplicates. Map holds two objects per entry (a key and a value). It may contain duplicate values but keys are always unique.

  • Removing and finding entries: A Set has efficient implementations of the contains and the remove methods, whereas a List must perform a linear search or copy operation that depends on the size of the list.

  • Null elements: List allows null elements. It is possible to have many null objects in a List because duplicates are allowed. Set does not allow null elements, and Map does not allow null keys.

  • Indexing: Lists are designed to allow access to any element by its index. Sets and maps don't have this ability. Although sets and maps are both insertion-ordered it is not possible to access an arbitrary entry in them with its insertion order. The entries must be accessed by lookup or by iterating over all entries. However, Set and Map both have methods first and last that are useful when using them as deduplicating queues or FIFOs.

  • Literals: Lists, ByteArrays, Sets and Maps have literals:

list := []
byte-array := #[]
set := {}
map := {:}

These are the preferred short cuts for:

list := List
byte-array := ByteArray 0
set := Set
map := Map

Literals may also contain values:

list-with-values := [1, 2, 3]
byte-array-with-values := #[1, 2, 3]
set-with-values := {1, 2}
map-with-values := {
  "key": 499,
  "key2": 42,
}

Both Set and Map use hash values to place elements into the underlying data structure. This means that all keys must implement the hash-code method. The performance of sets and maps is best if two different keys have a different hash code (with high probability).

Sets and Maps only use the hash-code function if their size is big enough. Objects that don't implement hash-code might thus not yield errors until the collections grow big enough. where the chance of two different objects having the same hash code should be low.

Examples of using Map

Lookup in a map is done with the indexing operator, []:

  score := scores[team]
  print "The score for $team is $score"

The index operator syntax is understood by the string interpolation syntax, so you can simply write:

  print "The score for $team is $scores[team]"

Updating or adding to a map is done with the []= operator:

graduate name/string:
  title-map[name] = "PhD"

The index operator, [] requires the key to be present in the map, otherwise it throws an exception. As an alternative, you can use the get method which by default returns null for a missing key:

class Scoring:
  // scorers is a map where the key is a team and the value
  // is a set of those that have scored goals for the team.
  scorers /Map ::= {:}

  register-goal team/string new-scorer/string -> none:
    goal-scorers /Set? := scorers.get team
    if goal-scorers == null:
      // First time the team scores, there is no set of
      // goal scorers in the map.  We insert an empty set
      // into which we can insert the new goal scorer.
      goal-scorers = {}  // Empty set.
      scorers[team] = goal-scorers
    goal-scorers.add new-scorer

Often there are more succinct ways to achieve the same with variations of the get method. The above could be written:

  register-goal team/string new-scorer/string -> none:
    // Get the set of goal-scorers for this team,
    // initializing with the empty set if necessary.
    goal-scorers /Set := scorers.get team --init=: {}
    goal-scorers.add new-scorer

The full set of map methods is available in the class documentation. See also the guide to using named block parameters

Functional methods

Toit contains a full set of the collection methods like map and reduce which are known from functional languages. Often you can use these instead of explicitly iterating over the collection.

Instead of:

report-odd collection/List:
  found := false
  collection.do:
    if it % 2 == 1:
      found = true
  if found:
    print "The collection contains at least one odd number"

You can use the any method, which also has the advantage that it stops when it has found a object in the collection which fulfills the predicate:

  if (collection.any: it % 2 == 1):
    print "The collection contains at least one odd number"

(In this case we added parentheses around the if-condition so the colon that introduces the block argument to any does not get mistaken for the colon of the if-statement. The any method is often called some in functional languages.)

Both of these examples made use of the automatic block argument, it.

Similarly, Toit collections have the every method known from functional languages:

  if (collection.every: it % 2 == 1):
    print "The collection contains only odd numbers"

The reduce method:

(The example shows the use of a block with two named arguments).

  sum := collection.reduce: | a b | a + b

This only works if the collection has at least one element and that element is a good starting point for the reduction operation. Otherwise you need to add the optional --initial argument. For example in the following example the elements of the collection are integers, but the result of the reduce operation has a different type, float.

sum-of-roots collection/Collection -> float:
  return collection.reduce --initial=0.0: | sum-so-far integer |
    sum-so-far + integer.sqrt

The version of reduce with an initial value is called fold in functional languages.

Similarly the List and Set classes have map:

// Creates a comma-separated list of quoted strings.
csv list/List -> string:
  Q ::= "\""  // A double quote.
  if (list.any: it.contains Q):
    throw "Can't handle strings containing quotes"
  quoted-list := list.map: "$Q$it$Q"
  return quoted-list.join ","

A more advanced version would handle strings containing quotes:

// Creates a comma-separated list of quoted strings,
// escaping quotes.
csv list/List -> string:
  Q ::= "\""  // A double quote.
  escaped-list := list.map:
    str := "$it"
    if str.contains Q:
      // In CSV quotes are escaped as two quotes.
      str.replace --all Q "$Q$Q"
    else:
      str
  quoted-list := escaped-list.map: "$Q$it$Q"
  return quoted-list.join ","

That example also illustrates that the value of the block (returned to the map method) is the result of the last statement. In this case one of the branches of the if will be the last statement, producing the return value. An alternative way to do this would be to use continue.map

We could also reduce it to an almost unreadable one-liner by putting an arbitrary expression in a string interpolation, surrounding it with $( and ):

// Creates a comma-separated list of quoted strings,
// escaping quotes.
csv list/List -> string:
  return (list.map: "\"$("$it".replace --all "\"" "\"\"")\"").join ","

Sorting

sort is a method of the class List.

To sort a list of numbers from biggest to smallest number and extract the smallest number

main:
  list := [3, 2, 5, 1]
  sorted := list.sort
  print sorted[0]

If the list should be sorted in place, avoiding the allocation of a fresh list

main:
  list := [3, 2, 5, 1]
  list.sort --in-place
  print list.first

To get the highest number of the sorted list

  print list.last

It is also possible to sort with a custom comparison by passing a block to the sort function

main:
  my-list := [1, 2, 5, 9]
  my-list.sort --in-place: | a b | b - a
  print my-list[0]

The block takes two arguments, which are here called a and b and should return (implicitly) a value that can be negative, zero, or positive. Note that b - a sorts in descending order.