Basic Mapping

Tags

In the previous tutorial, we used HTMap like this:

[1]:
import htmap

def double(x):
    return 2 * x
[2]:
mapped = htmap.map(double, range(10))
print(mapped)
doubled = list(mapped)
print(doubled)
Created map dark-puny-robe with 10 components
Map(tag = dark-puny-robe)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In particular, we used the htmap.map function to create our map. This function creates an object that behaves a lot like the iterator returned by the built-in map function. To get our output, we iterated over it using list.
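
For comparison, here is the same computation with the built-in map. This is a minimal sketch that uses only the standard library and does not involve HTMap at all. It also shows one place where the analogy stops: a built-in map object is exhausted after a single pass, whereas the HTMap map in this tutorial is iterated more than once later on.

```python
def double(x):
    return 2 * x

builtin_mapped = map(double, range(10))  # lazy iterator; nothing has run yet
doubled = list(builtin_mapped)           # iterating is what produces the results
print(doubled)

# A built-in map object can only be consumed once,
# so a second pass yields nothing.
second_pass = list(builtin_mapped)
print(second_pass)
```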

You may have noticed that the map has a tag associated with it. Because we didn’t provide a tag, HTMap generated one for us and marked the map as transient, as opposed to persistent. Transient maps are for quick tests where we don’t care too much about organization. Persistent maps are for longer-running work that we want to keep organized by giving things real names. If you don’t plan on using your map for more than one session, you can probably get away with a transient map. If you’re going to step away from the computer and come back, we recommend giving it a real tag.

The map we created above is transient:

[3]:
print(mapped.is_transient)
True

To create a persistent map, we need to give our map a tag:

[4]:
another_map = htmap.map(double, range(10), tag = 'dbl')
print(another_map)
print(another_map.is_transient)
Created map dbl with 10 components
Map(tag = dbl)
False

We can also “retag” a map to give it a new tag. If you retag a transient map, it becomes persistent.

[5]:
mapped.retag('a-new-tag')
print(mapped)
print(mapped.is_transient)
Map(tag = a-new-tag)
False

Working with Maps

The object returned by htmap.map is an htmap.Map. It gives us a window into the map while it is running, and lets us use the output once the map is finished.

For example, we can print the status of the map:

[6]:
stringified = htmap.map(str, range(10), tag = 'str')
print(stringified.status())
Created map str with 10 components
Map str (10 components): HELD = 0 | ERRORED = 0 | IDLE = 10 | RUNNING = 0 | COMPLETED = 0

We can wait for the map to finish:

[7]:
stringified.wait(show_progress_bar = True)
str: 100%|##########| 10/10 [00:09<00:00,  1.11component/s]

There are many ways to iterate over maps:

[8]:
print(list(stringified))

for d in stringified:
    print(d)
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
0
1
2
3
4
5
6
7
8
9

If we ever lose our reference to a map, we can get a new one using htmap.load, passing it the tag of the map we want:

[9]:
new_ref = htmap.load('str')

print(new_ref)
print(new_ref == stringified)
print(new_ref is stringified)  # maps are singletons
Map(tag = str)
True
True

Maps can be recovered from an entirely different Python interpreter session as well. Suppose you close Python and go on vacation. You come back and you want to look at your map again, but you’ve forgotten what you called it. Just ask HTMap for a list of your tags:

[10]:
print(htmap.get_tags())
('dbl', 'str', 'a-new-tag')

Ok, well, technically it was a tuple, but we’ll have to live with it.

HTMap can also print a pretty table showing the status of your maps:

[11]:
htmap.map(str, range(5))  # new transient map
print(htmap.status())
Created map breezy-happy-hand with 5 components
Tag                  HELD  ERRORED  IDLE  RUNNING  COMPLETED  Local Data  Max Memory  Max Runtime  Total Runtime
a-new-tag             0       0      0       0         10      63.9 KB     41.0 MB      0:00:00       0:00:00
dbl                   0       0      0       0         10      63.9 KB     41.0 MB      0:00:00       0:00:00
str                   0       0      0       0         10      63.5 KB     41.0 MB      0:00:00       0:00:00
* breezy-happy-hand   0       0      5       0         0       19.8 KB      0.0 B       0:00:00       0:00:00

Note that transient maps have a * in front of their tags.

The status message tells us how many components of our map are in each of the five most common component states:

  • Idle - component is waiting to run

  • Running - component is currently executing remotely

  • Completed - component is finished executing and output is available

  • Held - HTCondor has noticed a problem with the component and is not letting it run

  • Errored - there was an error in your code, and HTMap has brought back error information

The status of each component of your map is available using the map attribute component_statuses:

[12]:
print(new_ref.component_statuses)
[<ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>, <ComponentStatus.COMPLETED: 'COMPLETED'>]
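
A long list of statuses is hard to read at a glance, so it can help to tally it. The sketch below is self-contained and makes two assumptions: the ComponentStatus enum is a stand-in defined here for illustration (modeled on the repr shown above, not imported from HTMap), and statuses is made-up sample data rather than output from a real map. With HTMap available, the same tally function could presumably be given new_ref.component_statuses instead.

```python
from collections import Counter
from enum import Enum


class ComponentStatus(Enum):
    """Stand-in for illustration only; not imported from HTMap."""
    HELD = 'HELD'
    ERRORED = 'ERRORED'
    IDLE = 'IDLE'
    RUNNING = 'RUNNING'
    COMPLETED = 'COMPLETED'


def tally(statuses):
    """Count how many components are in each state, including empty states."""
    counts = Counter(statuses)
    return {status.name: counts[status] for status in ComponentStatus}


# Hypothetical sample data: 7 completed, 2 running, 1 idle.
statuses = (
    [ComponentStatus.COMPLETED] * 7
    + [ComponentStatus.RUNNING] * 2
    + [ComponentStatus.IDLE]
)
summary = tally(statuses)
print(summary)
```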

We’ll discuss what to do about held and errored components and how to interact with component statuses in the Error Handling tutorial.

Tags are unique: if we try to create another map with a tag we’ve already used, it will fail:

[13]:
new_map = htmap.map(double, range(10), tag = 'dbl')
---------------------------------------------------------------------------
TagAlreadyExists                          Traceback (most recent call last)
<ipython-input-13-397c48e54a47> in <module>
----> 1 new_map = htmap.map(double, range(10), tag = 'dbl')

~/htmap/htmap/mapping.py in map(func, args, map_options, tag)
     86         func,
     87         args_and_kwargs,
---> 88         map_options = map_options,
     89     )
     90

~/htmap/htmap/mapping.py in create_map(tag, func, args_and_kwargs, map_options)
    276
    277     tags.raise_if_tag_is_invalid(tag)
--> 278     tags.raise_if_tag_already_exists(tag)
    279
    280     logger.debug(f'Creating map {tag} ...')

~/htmap/htmap/tags.py in raise_if_tag_already_exists(tag)
     59     """Raise a :class:`htmap.exceptions.TagAlreadyExists` if the ``tag`` already exists."""
     60     if tag_file_path(tag).exists():
---> 61         raise exceptions.TagAlreadyExists(f'The requested tag "{tag}" already exists. Load the Map with htmap.load("{tag}"), or remove it using htmap.remove("{tag}").')
     62
     63

TagAlreadyExists: The requested tag "dbl" already exists. Load the Map with htmap.load("dbl"), or remove it using htmap.remove("dbl").

As the error message indicates, if we want to re-use the tag dbl, we need to remove the old map first:

[14]:
old_map = htmap.load('dbl')
old_map.remove()

htmap.Map.remove deletes all traces of the map. It can never be recovered. Be careful when using it!

The module-level shortcut htmap.remove lets you skip the intermediate step of getting the actual Map, if you don’t already have it.

Now we can re-use the tag:

[15]:
new_map = htmap.map(double, range(10), tag = 'dbl')
new_map.wait(show_progress_bar = True)
print(list(new_map))
dbl:   0%|          | 0/10 [00:00<?, ?component/s]
Created map dbl with 10 components
dbl: 100%|##########| 10/10 [00:07<00:00,  1.42component/s]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Map Builders

So far we’ve avoided functions that need to be mapped over keyword arguments or that take more than one positional argument. htmap.map is not the ideal tool for functions with more than one argument, and it does not support varying more than one argument at all.

A much more ergonomic way to build up a complex map is a map builder. A map builder lets you build a map via individual function calls. Call htmap.build_map as a context manager to get the builder, then call the builder as if it were the mapped function itself:

[16]:
def power(base, exponent):
    return base ** exponent

with htmap.build_map(power) as pow_builder:
    for base in range(1, 5):  # bases are 1, 2, 3, 4
        for exponent in range(1, 4):  # exponents are 1, 2, 3
            pow_builder(base, exponent)

powered = pow_builder.map
print(list(powered))  # 1^1, 1^2, 1^3, 2^1, 2^2, 2^3, 3^1 ...
Created map harsh-happy-ring with 12 components
[1, 1, 1, 2, 4, 8, 3, 9, 27, 4, 16, 64]

The map builder catches the function calls and turns them into a map. The map is created when the with block ends, at which point you can grab the actual htmap.Map from the builder’s map attribute.
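
To make the idea of catching function calls concrete, here is a toy builder in plain Python. It is not HTMap's implementation: ToyBuilder and its calls and results attributes are hypothetical names, and the function simply runs locally when the with block exits instead of being submitted anywhere. The sketch only illustrates the pattern of recording calls inside a context manager and acting on them once the block ends.

```python
class ToyBuilder:
    """Hypothetical illustration: records calls, then runs them on exit."""

    def __init__(self, func):
        self.func = func
        self.calls = []      # (args, kwargs) pairs captured inside the block
        self.results = None  # filled in when the with block ends

    def __enter__(self):
        return self

    def __call__(self, *args, **kwargs):
        # Record the arguments instead of running the function right away.
        self.calls.append((args, kwargs))

    def __exit__(self, exc_type, exc_value, traceback):
        # Only process the recorded calls if the block finished cleanly.
        if exc_type is None:
            self.results = [self.func(*a, **kw) for a, kw in self.calls]
        return False  # never suppress exceptions raised inside the block


def power(base, exponent):
    return base ** exponent


with ToyBuilder(power) as builder:
    for base in range(1, 5):          # bases are 1, 2, 3, 4
        for exponent in range(1, 4):  # exponents are 1, 2, 3
            builder(base, exponent)

print(builder.results)
```

Deferring the work until the block exits is what lets a builder see every call before anything is created, which matches the behavior described above where the map only exists after the with block ends.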


In the next tutorial, we’ll see how to tell HTMap to bring a local file along to the execute node.