Error Handling

Holds

In previous tutorials we mentioned that HTMap can track the status of your components and tell you about something called a “hold”. A hold occurs when HTCondor notices something wrong with one of your map components. For example, a component may name an input file that doesn’t exist, so HTCondor can’t transfer it to the execute node.

That case is easy to force, so let’s do it and see what happens:

[1]:
import htmap

@htmap.mapped
def foo(_):  # _ is a perfectly legal argument name, often used to mean "I don't actually use it"
    return "I didn't get held!"
[2]:
path = htmap.TransferPath('this-file-does-not-exist.txt')
will_get_held = foo.map(
    [path],
)
Created map angry-husky-law with 1 components

We know that the component will be held, but HTMap won’t raise an error about it until we try to look at the output:

[3]:
print(will_get_held.get(0))
---------------------------------------------------------------------------
MapComponentHeld                          Traceback (most recent call last)
<ipython-input-3-68dfbf32680e> in <module>
----> 1 print(will_get_held.get(0))

~/htmap/htmap/maps.py in _protect(self, *args, **kwargs)
     43         if not self.exists:
     44             raise exceptions.MapWasRemoved(f'Cannot call {method} for map {self.tag} because it has been removed')
---> 45         return method(self, *args, **kwargs)
     46
     47     return _protect

~/htmap/htmap/maps.py in get(self, component, timeout)
    390             If ``None``, wait forever.
    391         """
--> 392         return self._load_output(component, timeout = timeout)
    393
    394     def __getitem__(self, item: int) -> Any:

~/htmap/htmap/maps.py in _load_output(self, component, timeout)
    341             raise IndexError(f'Tried to get output for component {component}, but map {self.tag} only has {len(self)} components')
    342
--> 343         self._wait_for_component(component, timeout)
    344
    345         status_and_result = htio.load_objects(self._output_file_path(component))

~/htmap/htmap/maps.py in _wait_for_component(self, component, timeout)
    307                 break
    308             elif component_status is state.ComponentStatus.HELD:
--> 309                 raise exceptions.MapComponentHeld(f'Component {component} of map {self.tag} is held: {self.holds[component]}')
    310
    311             if timeout is not None and (time.time() >= start_time + timeout):

MapComponentHeld: Component 0 of map angry-husky-law is held: [13] Error from slot1_6@1bea834c10a5: SHADOW at 172.17.0.2 failed to send file(s) to <172.17.0.2:33571>: error reading from /home/jovyan/tutorials/this-file-does-not-exist.txt: (errno 2) No such file or directory; STARTER failed to receive file(s) from <172.17.0.2:9618>

Yikes! HTMap has raised an exception to inform us that a component of our map got held. It also tells us why HTCondor held the component: error reading from /home/jovyan/tutorials/this-file-does-not-exist.txt: (errno 2) No such file or directory; STARTER failed to receive file(s) from <172.17.0.2:9618>.
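
When a map has many components, it can help to pull the interesting part out of a hold reason programmatically. The sketch below is not part of HTMap: `missing_input_file` is a hypothetical helper, and its pattern is an assumption based only on the hold reason shown above, so other kinds of holds will simply not match.

```python
import re
from typing import Optional

# Assumed pattern, taken from the hold reason printed in this tutorial.
_MISSING_FILE = re.compile(r"error reading from (?P<path>.+?): \(errno 2\)")


def missing_input_file(hold_reason: str) -> Optional[str]:
    """Return the path HTCondor could not read, or None for any other hold reason."""
    match = _MISSING_FILE.search(hold_reason)
    return match.group("path") if match else None


reason = (
    "[13] Error from slot1_6@1bea834c10a5: SHADOW at 172.17.0.2 failed to send file(s) "
    "to <172.17.0.2:33571>: error reading from "
    "/home/jovyan/tutorials/this-file-does-not-exist.txt: (errno 2) No such file or directory"
)
print(missing_input_file(reason))
```

The helper returns None rather than raising when the reason looks different, so it is safe to run over every hold in a map.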

This time around the hold reason is pretty clear: a local file that HTCondor expected to exist didn’t. We could fix the problem by creating the file, and then releasing the map, which tells HTCondor to try again:

[4]:
path.touch()  # this creates an empty file

Now that the file exists, we “release” the map so that the held component can run:

[5]:
will_get_held.release()
print(will_get_held.get(0))
I didn't get held!

Debugging holds

Unfortunately, holds will often not be so easy to resolve. Sometimes they are simply ephemeral errors that can be resolved by releasing the map without changing anything. But sometimes you’ll need to talk to your HTCondor pool administrator to figure out what’s going wrong.
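
For ephemeral holds, the release-and-retry step can be wrapped in a small loop. This is a minimal sketch, not an HTMap feature: `get_with_release` is a hypothetical helper that relies only on the `get(component, timeout=...)` and `release()` methods used earlier in this tutorial, and it takes the exception type as an argument so that it stays independent of any import.

```python
import time


def get_with_release(map_, component, held_exception, max_releases=3, pause=5.0, timeout=None):
    """Fetch one component's output, releasing the map each time it reports a hold.

    map_ is any object with get(component, timeout=...) and release() methods.
    held_exception is the exception type that signals a hold.
    """
    for attempt in range(max_releases + 1):
        try:
            return map_.get(component, timeout=timeout)
        except held_exception:
            if attempt == max_releases:
                raise  # still held after every allowed release, so give up
            map_.release()
            time.sleep(pause)  # give HTCondor a moment to act on the release
```

With the map from above, a call might look like `get_with_release(will_get_held, 0, htmap.exceptions.MapComponentHeld)`. Keep `max_releases` small: if a hold comes back every time, it is not ephemeral and needs a real fix.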

Holds can also be caused by extra settings in your ~/.htmaprc file, so check that it contains only the parameters you intend.

If you’re feeling really adventurous, look at the files in the ~/.htmap/ directory. The standard output and error files are stored there, and they may help you track down the problem.
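
If you do go digging, the most recently modified files are usually the most relevant. The sketch below uses only the standard library and makes no assumption about how HTMap lays out the directory; `newest_files` is a hypothetical helper name.

```python
from pathlib import Path


def newest_files(root, limit=10):
    """Return up to `limit` regular files under `root`, most recently modified first."""
    root = Path(root).expanduser()
    files = [p for p in root.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


htmap_dir = Path("~/.htmap").expanduser()
if htmap_dir.is_dir():  # skip quietly on machines where HTMap has never run
    for path in newest_files(htmap_dir):
        print(path)
```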

Execution Errors

HTMap can also detect Python exceptions that occur during component execution. To see this in action, let’s define a function where a component will have a problem:

[6]:
@htmap.mapped
def inverse(x):
    return 1 / x

When x = 0, inverse(x) will fail with a ZeroDivisionError. If we run it locally, the error will halt execution and drop a traceback into our laps:

[7]:
inverse(0)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-7-7538d73c586c> in <module>
----> 1 inverse(0)

~/htmap/htmap/mapped.py in __call__(self, *args, **kwargs)
     50     def __call__(self, *args, **kwargs):
     51         """Call the function as normal, locally."""
---> 52         return self.func(*args, **kwargs)
     53
     54     def map(

<ipython-input-6-769ac4dfb4b6> in inverse(x)
      1 @htmap.mapped
      2 def inverse(x):
----> 3     return 1 / x

ZeroDivisionError: division by zero

The traceback contains a lot of useful information. In particular, it tells us exactly which line raised the error (remember that tracebacks should be read in reverse: the last block of source code is where the error began).
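
The "read it in reverse" rule can also be applied in code. The sketch below uses the standard library `traceback` module to pick out the last frame of an exception, which is where the error was raised. `failing_frame` is a hypothetical helper, and `inverse` is redefined here as a plain function so that the example runs without HTMap.

```python
import traceback


def inverse(x):
    return 1 / x


def failing_frame(exc):
    """Return (function name, source line) for the frame where `exc` was raised."""
    last = traceback.extract_tb(exc.__traceback__)[-1]
    return last.name, last.line


try:
    inverse(0)
except ZeroDivisionError as exc:
    name, line = failing_frame(exc)
    print(f"{type(exc).__name__} raised in {name}(): {line}")
```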

HTMap can transport this kind of information back from an executing component but, like the regular output of a map, we won’t see it until we try to load the output of the failed component. We’ll make a one-component map to demonstrate what happens:

[8]:
bad_map = inverse.map([0])
bad_map.get(0)
Created map fair-sly-drone with 1 components
---------------------------------------------------------------------------
MapComponentError                         Traceback (most recent call last)
<ipython-input-8-d23b8117e4db> in <module>
      1 bad_map = inverse.map([0])
----> 2 bad_map.get(0)

~/htmap/htmap/maps.py in _protect(self, *args, **kwargs)
     43         if not self.exists:
     44             raise exceptions.MapWasRemoved(f'Cannot call {method} for map {self.tag} because it has been removed')
---> 45         return method(self, *args, **kwargs)
     46
     47     return _protect

~/htmap/htmap/maps.py in get(self, component, timeout)
    390             If ``None``, wait forever.
    391         """
--> 392         return self._load_output(component, timeout = timeout)
    393
    394     def __getitem__(self, item: int) -> Any:

~/htmap/htmap/maps.py in _load_output(self, component, timeout)
    348             return next(status_and_result)
    349         elif status == 'ERR':
--> 350             raise exceptions.MapComponentError(f'Component {component} of map {self.tag} encountered error while executing. Error report:\n{self._load_error(component).report()}')
    351         else:
    352             raise exceptions.InvalidOutputStatus(f'Output status {status} is not valid')

MapComponentError: Component 0 of map fair-sly-drone encountered error while executing. Error report:
==========  Start error report for component 0 of map fair-sly-drone  ==========
Landed on execute node 1bea834c10a5 (172.17.0.2) at 2020-05-21 17:45:40.954824

Python executable is /opt/conda/bin/python3 (version 3.7.6)
with installed packages
  alembic==1.4.2
  async-generator==1.10
  attrs==19.3.0
  backcall==0.1.0
  bleach==3.1.4
  blinker==1.4
  brotlipy==0.7.0
  certifi==2020.4.5.1
  certipy==0.1.3
  cffi==1.14.0
  chardet==3.0.4
  click==7.1.2
  click-didyoumean==0.0.3
  cloudpickle==1.4.1
  colorama==0.4.3
  conda==4.8.2
  conda-package-handling==1.6.0
  cryptography==2.9.2
  cursor==1.3.4
  decorator==4.4.2
  defusedxml==0.6.0
  entrypoints==0.3
  halo==0.0.29
  htchirp==1.0
  htcondor==8.9.6
  -e git+https://github.com/htcondor/htmap.git@e0fd6de94fcad0295ae674e5479fac51cf57f34f#egg=htmap
  idna==2.9
  importlib-metadata==1.6.0
  ipykernel==5.2.1
  ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1588362967322/work
  ipython-genutils==0.2.0
  jedi==0.17.0
  Jinja2==2.11.2
  json5==0.9.0
  jsonschema==3.2.0
  jupyter-client==6.1.3
  jupyter-core==4.6.3
  jupyter-telemetry==0.0.5
  jupyterhub==1.1.0
  jupyterlab==2.1.1
  jupyterlab-server==1.1.1
  log-symbols==0.0.14
  Mako==1.1.0
  MarkupSafe==1.1.1
  mistune==0.8.4
  nbconvert==5.6.1
  nbformat==5.0.6
  nbstripout==0.3.7
  notebook==6.0.3
  oauthlib==3.0.1
  pamela==1.0.0
  pandocfilters==1.4.2
  parso==0.7.0
  pexpect==4.8.0
  pickleshare==0.7.5
  prometheus-client==0.7.1
  prompt-toolkit==3.0.5
  ptyprocess==0.6.0
  pycosat==0.6.3
  pycparser==2.20
  pycurl==7.43.0.5
  Pygments==2.6.1
  PyJWT==1.7.1
  pyOpenSSL==19.1.0
  pyrsistent==0.16.0
  PySocks==1.7.1
  python-dateutil==2.8.1
  python-editor==1.0.4
  python-json-logger==0.1.11
  pyzmq==19.0.0
  requests==2.23.0
  ruamel-yaml==0.15.80
  ruamel.yaml.clib==0.2.0
  Send2Trash==1.5.0
  six==1.14.0
  spinners==0.0.24
  SQLAlchemy==1.3.16
  termcolor==1.1.0
  terminado==0.8.3
  testpath==0.4.4
  toml==0.10.0
  tornado==6.0.4
  tqdm==4.46.0
  traitlets==4.3.3
  urllib3==1.25.9
  wcwidth==0.1.9
  webencodings==0.5.1
  zipp==3.1.0

Scratch directory contents are
  /home/jovyan/.condor/local/execute/dir_461/.chirp.config
  /home/jovyan/.condor/local/execute/dir_461/_htmap_user_transfer
  /home/jovyan/.condor/local/execute/dir_461/.job.ad
  /home/jovyan/.condor/local/execute/dir_461/_condor_stderr
  /home/jovyan/.condor/local/execute/dir_461/.machine.ad
  /home/jovyan/.condor/local/execute/dir_461/func
  /home/jovyan/.condor/local/execute/dir_461/_condor_stdout
  /home/jovyan/.condor/local/execute/dir_461/0.in
  /home/jovyan/.condor/local/execute/dir_461/_htmap_transfer
  /home/jovyan/.condor/local/execute/dir_461/_htmap_do_output_transfer
  /home/jovyan/.condor/local/execute/dir_461/_htmap_transfer_plugin_cache
  /home/jovyan/.condor/local/execute/dir_461/condor_exec.exe
  /home/jovyan/.condor/local/execute/dir_461/.update.ad

Exception and traceback (most recent call last):
  File "<ipython-input-6-769ac4dfb4b6>", line 3, in inverse
    return 1 / x

    Local variables:
      x = 0

  ZeroDivisionError: division by zero

===========  End error report for component 0 of map fair-sly-drone  ===========

Neat! This traceback is, unfortunately, harder to read than the other one. We can ignore everything above the line MapComponentError: Component 0 of map <tag> encountered error while executing. Error report: because that part is just the internal error that HTMap raises to propagate the problem to us. The real error is everything below ==========  Start error report for component 0 of map <tag>  ==========.

Since we’re trying to debug remotely, HTMap has gathered some metadata about the HTCondor “execute node” where the component was running. First it tells us which node the component landed on and when. Next, the report tells us about the Python environment that was used to execute your function, including a list of installed packages. We also get a listing of the contents of the scratch directory - in this example, because we didn’t add any extra input files, it’s just a bunch of files that HTCondor and HTMap are using.

The meat of the error is the last thing in the error report. We get roughly the same information that we got in the local traceback, but we also get a printout of the local variables in each stack frame.
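
Since the exception is always the last thing in the report, a one-line summary is easy to extract. This is a minimal sketch that assumes the report layout shown above (it may differ between HTMap versions). `exception_line` is a hypothetical helper, and `sample` is an abbreviated copy of the report from this tutorial.

```python
def exception_line(report):
    """Return the final exception line of an error report, or '' for an empty report."""
    lines = [line.strip() for line in str(report).splitlines() if line.strip()]
    # Drop the closing "End error report" banner if it is present.
    if lines and "End error report" in lines[-1]:
        lines.pop()
    return lines[-1] if lines else ""


sample = """\
==========  Start error report for component 0 of map fair-sly-drone  ==========
Exception and traceback (most recent call last):
  File "<ipython-input-6-769ac4dfb4b6>", line 3, in inverse
    return 1 / x

    Local variables:
      x = 0

  ZeroDivisionError: division by zero

===========  End error report for component 0 of map fair-sly-drone  ===========
"""
print(exception_line(sample))  # ZeroDivisionError: division by zero
```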

Since the local HTMap error is raised as soon as HTMap finds one bad component, you may find it convenient to look at all of the error reports for your map at once (hopefully not too many!). htmap.Map.error_reports provides exactly this functionality:

[9]:
worse_map = inverse.map([0, 0, 0])
worse_map.wait(errors_ok = True)  # wait for all of the components to hit the error
for report in worse_map.error_reports():
    print(report + '\n')
Created map firm-vast-oven with 3 components
==========  Start error report for component 0 of map firm-vast-oven  ==========
Landed on execute node 1bea834c10a5 (172.17.0.2) at 2020-05-21 17:45:44.454503

Python executable is /opt/conda/bin/python3 (version 3.7.6)
with installed packages
  alembic==1.4.2
  async-generator==1.10
  attrs==19.3.0
  backcall==0.1.0
  bleach==3.1.4
  blinker==1.4
  brotlipy==0.7.0
  certifi==2020.4.5.1
  certipy==0.1.3
  cffi==1.14.0
  chardet==3.0.4
  click==7.1.2
  click-didyoumean==0.0.3
  cloudpickle==1.4.1
  colorama==0.4.3
  conda==4.8.2
  conda-package-handling==1.6.0
  cryptography==2.9.2
  cursor==1.3.4
  decorator==4.4.2
  defusedxml==0.6.0
  entrypoints==0.3
  halo==0.0.29
  htchirp==1.0
  htcondor==8.9.6
  -e git+https://github.com/htcondor/htmap.git@e0fd6de94fcad0295ae674e5479fac51cf57f34f#egg=htmap
  idna==2.9
  importlib-metadata==1.6.0
  ipykernel==5.2.1
  ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1588362967322/work
  ipython-genutils==0.2.0
  jedi==0.17.0
  Jinja2==2.11.2
  json5==0.9.0
  jsonschema==3.2.0
  jupyter-client==6.1.3
  jupyter-core==4.6.3
  jupyter-telemetry==0.0.5
  jupyterhub==1.1.0
  jupyterlab==2.1.1
  jupyterlab-server==1.1.1
  log-symbols==0.0.14
  Mako==1.1.0
  MarkupSafe==1.1.1
  mistune==0.8.4
  nbconvert==5.6.1
  nbformat==5.0.6
  nbstripout==0.3.7
  notebook==6.0.3
  oauthlib==3.0.1
  pamela==1.0.0
  pandocfilters==1.4.2
  parso==0.7.0
  pexpect==4.8.0
  pickleshare==0.7.5
  prometheus-client==0.7.1
  prompt-toolkit==3.0.5
  ptyprocess==0.6.0
  pycosat==0.6.3
  pycparser==2.20
  pycurl==7.43.0.5
  Pygments==2.6.1
  PyJWT==1.7.1
  pyOpenSSL==19.1.0
  pyrsistent==0.16.0
  PySocks==1.7.1
  python-dateutil==2.8.1
  python-editor==1.0.4
  python-json-logger==0.1.11
  pyzmq==19.0.0
  requests==2.23.0
  ruamel-yaml==0.15.80
  ruamel.yaml.clib==0.2.0
  Send2Trash==1.5.0
  six==1.14.0
  spinners==0.0.24
  SQLAlchemy==1.3.16
  termcolor==1.1.0
  terminado==0.8.3
  testpath==0.4.4
  toml==0.10.0
  tornado==6.0.4
  tqdm==4.46.0
  traitlets==4.3.3
  urllib3==1.25.9
  wcwidth==0.1.9
  webencodings==0.5.1
  zipp==3.1.0

Scratch directory contents are
  /home/jovyan/.condor/local/execute/dir_492/.chirp.config
  /home/jovyan/.condor/local/execute/dir_492/_htmap_user_transfer
  /home/jovyan/.condor/local/execute/dir_492/.job.ad
  /home/jovyan/.condor/local/execute/dir_492/_condor_stderr
  /home/jovyan/.condor/local/execute/dir_492/.machine.ad
  /home/jovyan/.condor/local/execute/dir_492/func
  /home/jovyan/.condor/local/execute/dir_492/_condor_stdout
  /home/jovyan/.condor/local/execute/dir_492/0.in
  /home/jovyan/.condor/local/execute/dir_492/_htmap_transfer
  /home/jovyan/.condor/local/execute/dir_492/_htmap_do_output_transfer
  /home/jovyan/.condor/local/execute/dir_492/_htmap_transfer_plugin_cache
  /home/jovyan/.condor/local/execute/dir_492/condor_exec.exe
  /home/jovyan/.condor/local/execute/dir_492/.update.ad

Exception and traceback (most recent call last):
  File "<ipython-input-6-769ac4dfb4b6>", line 3, in inverse
    return 1 / x

    Local variables:
      x = 0

  ZeroDivisionError: division by zero

===========  End error report for component 0 of map firm-vast-oven  ===========

==========  Start error report for component 1 of map firm-vast-oven  ==========
Landed on execute node 1bea834c10a5 (172.17.0.2) at 2020-05-21 17:45:44.216714

Python executable is /opt/conda/bin/python3 (version 3.7.6)
with installed packages
  alembic==1.4.2
  async-generator==1.10
  attrs==19.3.0
  backcall==0.1.0
  bleach==3.1.4
  blinker==1.4
  brotlipy==0.7.0
  certifi==2020.4.5.1
  certipy==0.1.3
  cffi==1.14.0
  chardet==3.0.4
  click==7.1.2
  click-didyoumean==0.0.3
  cloudpickle==1.4.1
  colorama==0.4.3
  conda==4.8.2
  conda-package-handling==1.6.0
  cryptography==2.9.2
  cursor==1.3.4
  decorator==4.4.2
  defusedxml==0.6.0
  entrypoints==0.3
  halo==0.0.29
  htchirp==1.0
  htcondor==8.9.6
  -e git+https://github.com/htcondor/htmap.git@e0fd6de94fcad0295ae674e5479fac51cf57f34f#egg=htmap
  idna==2.9
  importlib-metadata==1.6.0
  ipykernel==5.2.1
  ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1588362967322/work
  ipython-genutils==0.2.0
  jedi==0.17.0
  Jinja2==2.11.2
  json5==0.9.0
  jsonschema==3.2.0
  jupyter-client==6.1.3
  jupyter-core==4.6.3
  jupyter-telemetry==0.0.5
  jupyterhub==1.1.0
  jupyterlab==2.1.1
  jupyterlab-server==1.1.1
  log-symbols==0.0.14
  Mako==1.1.0
  MarkupSafe==1.1.1
  mistune==0.8.4
  nbconvert==5.6.1
  nbformat==5.0.6
  nbstripout==0.3.7
  notebook==6.0.3
  oauthlib==3.0.1
  pamela==1.0.0
  pandocfilters==1.4.2
  parso==0.7.0
  pexpect==4.8.0
  pickleshare==0.7.5
  prometheus-client==0.7.1
  prompt-toolkit==3.0.5
  ptyprocess==0.6.0
  pycosat==0.6.3
  pycparser==2.20
  pycurl==7.43.0.5
  Pygments==2.6.1
  PyJWT==1.7.1
  pyOpenSSL==19.1.0
  pyrsistent==0.16.0
  PySocks==1.7.1
  python-dateutil==2.8.1
  python-editor==1.0.4
  python-json-logger==0.1.11
  pyzmq==19.0.0
  requests==2.23.0
  ruamel-yaml==0.15.80
  ruamel.yaml.clib==0.2.0
  Send2Trash==1.5.0
  six==1.14.0
  spinners==0.0.24
  SQLAlchemy==1.3.16
  termcolor==1.1.0
  terminado==0.8.3
  testpath==0.4.4
  toml==0.10.0
  tornado==6.0.4
  tqdm==4.46.0
  traitlets==4.3.3
  urllib3==1.25.9
  wcwidth==0.1.9
  webencodings==0.5.1
  zipp==3.1.0

Scratch directory contents are
  /home/jovyan/.condor/local/execute/dir_487/.chirp.config
  /home/jovyan/.condor/local/execute/dir_487/_htmap_user_transfer
  /home/jovyan/.condor/local/execute/dir_487/.job.ad
  /home/jovyan/.condor/local/execute/dir_487/_condor_stderr
  /home/jovyan/.condor/local/execute/dir_487/.machine.ad
  /home/jovyan/.condor/local/execute/dir_487/func
  /home/jovyan/.condor/local/execute/dir_487/_condor_stdout
  /home/jovyan/.condor/local/execute/dir_487/_htmap_transfer
  /home/jovyan/.condor/local/execute/dir_487/1.in
  /home/jovyan/.condor/local/execute/dir_487/_htmap_do_output_transfer
  /home/jovyan/.condor/local/execute/dir_487/_htmap_transfer_plugin_cache
  /home/jovyan/.condor/local/execute/dir_487/condor_exec.exe
  /home/jovyan/.condor/local/execute/dir_487/.update.ad

Exception and traceback (most recent call last):
  File "<ipython-input-6-769ac4dfb4b6>", line 3, in inverse
    return 1 / x

    Local variables:
      x = 0

  ZeroDivisionError: division by zero

===========  End error report for component 1 of map firm-vast-oven  ===========

==========  Start error report for component 2 of map firm-vast-oven  ==========
Landed on execute node 1bea834c10a5 (172.17.0.2) at 2020-05-21 17:45:44.383019

Python executable is /opt/conda/bin/python3 (version 3.7.6)
with installed packages
  alembic==1.4.2
  async-generator==1.10
  attrs==19.3.0
  backcall==0.1.0
  bleach==3.1.4
  blinker==1.4
  brotlipy==0.7.0
  certifi==2020.4.5.1
  certipy==0.1.3
  cffi==1.14.0
  chardet==3.0.4
  click==7.1.2
  click-didyoumean==0.0.3
  cloudpickle==1.4.1
  colorama==0.4.3
  conda==4.8.2
  conda-package-handling==1.6.0
  cryptography==2.9.2
  cursor==1.3.4
  decorator==4.4.2
  defusedxml==0.6.0
  entrypoints==0.3
  halo==0.0.29
  htchirp==1.0
  htcondor==8.9.6
  -e git+https://github.com/htcondor/htmap.git@e0fd6de94fcad0295ae674e5479fac51cf57f34f#egg=htmap
  idna==2.9
  importlib-metadata==1.6.0
  ipykernel==5.2.1
  ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1588362967322/work
  ipython-genutils==0.2.0
  jedi==0.17.0
  Jinja2==2.11.2
  json5==0.9.0
  jsonschema==3.2.0
  jupyter-client==6.1.3
  jupyter-core==4.6.3
  jupyter-telemetry==0.0.5
  jupyterhub==1.1.0
  jupyterlab==2.1.1
  jupyterlab-server==1.1.1
  log-symbols==0.0.14
  Mako==1.1.0
  MarkupSafe==1.1.1
  mistune==0.8.4
  nbconvert==5.6.1
  nbformat==5.0.6
  nbstripout==0.3.7
  notebook==6.0.3
  oauthlib==3.0.1
  pamela==1.0.0
  pandocfilters==1.4.2
  parso==0.7.0
  pexpect==4.8.0
  pickleshare==0.7.5
  prometheus-client==0.7.1
  prompt-toolkit==3.0.5
  ptyprocess==0.6.0
  pycosat==0.6.3
  pycparser==2.20
  pycurl==7.43.0.5
  Pygments==2.6.1
  PyJWT==1.7.1
  pyOpenSSL==19.1.0
  pyrsistent==0.16.0
  PySocks==1.7.1
  python-dateutil==2.8.1
  python-editor==1.0.4
  python-json-logger==0.1.11
  pyzmq==19.0.0
  requests==2.23.0
  ruamel-yaml==0.15.80
  ruamel.yaml.clib==0.2.0
  Send2Trash==1.5.0
  six==1.14.0
  spinners==0.0.24
  SQLAlchemy==1.3.16
  termcolor==1.1.0
  terminado==0.8.3
  testpath==0.4.4
  toml==0.10.0
  tornado==6.0.4
  tqdm==4.46.0
  traitlets==4.3.3
  urllib3==1.25.9
  wcwidth==0.1.9
  webencodings==0.5.1
  zipp==3.1.0

Scratch directory contents are
  /home/jovyan/.condor/local/execute/dir_488/.chirp.config
  /home/jovyan/.condor/local/execute/dir_488/_htmap_user_transfer
  /home/jovyan/.condor/local/execute/dir_488/.job.ad
  /home/jovyan/.condor/local/execute/dir_488/_condor_stderr
  /home/jovyan/.condor/local/execute/dir_488/.machine.ad
  /home/jovyan/.condor/local/execute/dir_488/func
  /home/jovyan/.condor/local/execute/dir_488/_condor_stdout
  /home/jovyan/.condor/local/execute/dir_488/_htmap_transfer
  /home/jovyan/.condor/local/execute/dir_488/2.in
  /home/jovyan/.condor/local/execute/dir_488/_htmap_do_output_transfer
  /home/jovyan/.condor/local/execute/dir_488/_htmap_transfer_plugin_cache
  /home/jovyan/.condor/local/execute/dir_488/condor_exec.exe
  /home/jovyan/.condor/local/execute/dir_488/.update.ad

Exception and traceback (most recent call last):
  File "<ipython-input-6-769ac4dfb4b6>", line 3, in inverse
    return 1 / x

    Local variables:
      x = 0

  ZeroDivisionError: division by zero

===========  End error report for component 2 of map firm-vast-oven  ===========

Unlike holds, you generally won’t want to re-run components that experienced errors (they’ll just fail again). Instead, an error is usually a signal that you’ve got a bug in your own code. Remove your map, debug the error locally, then create a new map.
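
One way to debug locally is to call the function on the same inputs and collect the ones that fail. The sketch below is plain Python: `failing_inputs` is a hypothetical helper, and `inverse` is defined without the decorator so the example runs without HTCondor. A mapped function can be called locally too, as `inverse(0)` showed earlier.

```python
def failing_inputs(func, inputs):
    """Call func on each input locally and collect (input, exception) pairs for the failures."""
    failures = []
    for value in inputs:
        try:
            func(value)
        except Exception as exc:  # broad on purpose: we want to see every failure
            failures.append((value, exc))
    return failures


def inverse(x):
    return 1 / x


for value, exc in failing_inputs(inverse, [2, 0, 4]):
    print(f"input {value!r} failed with {type(exc).__name__}: {exc}")
```

This only makes sense when the function is cheap enough to run locally. For expensive functions, try just the inputs whose components reported errors.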

Standard Output and Error

When handling trickier errors, you may need to look at the stdout and stderr from your map components. stdout and stderr are what you would see on the terminal if you executed your code locally - things like print and exceptions normally display their information there. HTMap provides access to stdout and stderr for each component through the appropriately-named attributes of your maps:

[10]:
import sys

@htmap.mapped
def stdx(_):
    print("Hi from stdout!")  # stdout is the default
    print("Hi from stderr!", file = sys.stderr)

m = stdx.map([None])
Created map quick-calm-stream with 1 components
[11]:
m.stdout.get(0)  # get will wait for the stdout to become available, m.stdout[0] wouldn't
[11]:
Landed on execute node 1bea834c10a5 (172.17.0.2) at 2020-05-21 17:45:47.056114 as jovyan

Scratch directory contents before run:
|- .chirp.config
|- .job.ad
|- .machine.ad
|- .update.ad
|- 0.in
|- _condor_stderr
|- _condor_stdout
|- _htmap_do_output_transfer
|- * _htmap_transfer
|- * _htmap_transfer_plugin_cache
|- * _htmap_user_transfer
|  \- * 0
|- condor_exec.exe
\- func

Python executable is /opt/conda/bin/python3 (version 3.7.6)
with installed packages
  alembic==1.4.2
  async-generator==1.10
  attrs==19.3.0
  backcall==0.1.0
  bleach==3.1.4
  blinker==1.4
  brotlipy==0.7.0
  certifi==2020.4.5.1
  certipy==0.1.3
  cffi==1.14.0
  chardet==3.0.4
  click==7.1.2
  click-didyoumean==0.0.3
  cloudpickle==1.4.1
  colorama==0.4.3
  conda==4.8.2
  conda-package-handling==1.6.0
  cryptography==2.9.2
  cursor==1.3.4
  decorator==4.4.2
  defusedxml==0.6.0
  entrypoints==0.3
  halo==0.0.29
  htchirp==1.0
  htcondor==8.9.6
  -e git+https://github.com/htcondor/htmap.git@e0fd6de94fcad0295ae674e5479fac51cf57f34f#egg=htmap
  idna==2.9
  importlib-metadata==1.6.0
  ipykernel==5.2.1
  ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1588362967322/work
  ipython-genutils==0.2.0
  jedi==0.17.0
  Jinja2==2.11.2
  json5==0.9.0
  jsonschema==3.2.0
  jupyter-client==6.1.3
  jupyter-core==4.6.3
  jupyter-telemetry==0.0.5
  jupyterhub==1.1.0
  jupyterlab==2.1.1
  jupyterlab-server==1.1.1
  log-symbols==0.0.14
  Mako==1.1.0
  MarkupSafe==1.1.1
  mistune==0.8.4
  nbconvert==5.6.1
  nbformat==5.0.6
  nbstripout==0.3.7
  notebook==6.0.3
  oauthlib==3.0.1
  pamela==1.0.0
  pandocfilters==1.4.2
  parso==0.7.0
  pexpect==4.8.0
  pickleshare==0.7.5
  prometheus-client==0.7.1
  prompt-toolkit==3.0.5
  ptyprocess==0.6.0
  pycosat==0.6.3
  pycparser==2.20
  pycurl==7.43.0.5
  Pygments==2.6.1
  PyJWT==1.7.1
  pyOpenSSL==19.1.0
  pyrsistent==0.16.0
  PySocks==1.7.1
  python-dateutil==2.8.1
  python-editor==1.0.4
  python-json-logger==0.1.11
  pyzmq==19.0.0
  requests==2.23.0
  ruamel-yaml==0.15.80
  ruamel.yaml.clib==0.2.0
  Send2Trash==1.5.0
  six==1.14.0
  spinners==0.0.24
  SQLAlchemy==1.3.16
  termcolor==1.1.0
  terminado==0.8.3
  testpath==0.4.4
  toml==0.10.0
  tornado==6.0.4
  tqdm==4.46.0
  traitlets==4.3.3
  urllib3==1.25.9
  wcwidth==0.1.9
  webencodings==0.5.1
  zipp==3.1.0

Running component 0
  <function stdx at 0x146c42004680>
with args
  (None,)
and kwargs
  {}

----- MAP COMPONENT OUTPUT START -----

Hi from stdout!

-----  MAP COMPONENT OUTPUT END  -----

Finished executing component at 2020-05-21 17:45:47.256167

Scratch directory contents after run:
|- .chirp.config
|- .job.ad
|- .machine.ad
|- .update.ad
|- 0.in
|- _condor_stderr
|- _condor_stdout
|- * _htmap_current_checkpoint
|- _htmap_do_output_transfer
|- * _htmap_transfer
|  \- 0.out
|- * _htmap_transfer_plugin_cache
|- * _htmap_user_transfer
|  \- * 0
|- condor_exec.exe
\- func

Note that much of the same information from the error report is included in the component stdout for convenience.
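
If you only care about what your own function printed, you can cut the text between the two MAP COMPONENT OUTPUT banners out of the component stdout. This is a sketch under the assumption that the banners look like the ones above. `user_output` is a hypothetical helper, and `sample` is an abbreviated copy of the output shown in this tutorial.

```python
import re

# The banners differ slightly in spacing, so the pattern tolerates extra whitespace.
_OUTPUT_BLOCK = re.compile(
    r"-+\s*MAP COMPONENT OUTPUT START\s*-+\n(?P<body>.*?)\n-+\s*MAP COMPONENT OUTPUT END\s*-+",
    re.DOTALL,
)


def user_output(component_stdout):
    """Return the text printed by the mapped function, or None if the banners are missing."""
    match = _OUTPUT_BLOCK.search(component_stdout)
    return match.group("body").strip() if match else None


sample = """\
Running component 0

----- MAP COMPONENT OUTPUT START -----

Hi from stdout!

-----  MAP COMPONENT OUTPUT END  -----

Finished executing component at 2020-05-21 17:45:47.256167
"""
print(user_output(sample))  # Hi from stdout!
```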

[12]:
m.stderr.get(0)
[12]:
Hi from stderr!

These attributes are both iterable sequences, which means that you can do something like this:

[13]:
@htmap.mapped
def err(x):
    print(f"Hi from stderr! {x}", file = sys.stderr)

err_map = err.map(range(5))
err_map.wait(show_progress_bar = True)

for e in err_map.stderr:
    print(e)
green-happy-year:   0%|          | 0/5 [00:00<?, ?component/s]
Created map green-happy-year with 5 components
green-happy-year: 100%|##########| 5/5 [00:04<00:00,  1.25component/s]
Hi from stderr! 0

Hi from stderr! 1

Hi from stderr! 2

Hi from stderr! 3

Hi from stderr! 4
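
Because the streams are iterable, ordinary Python tools work on them. The sketch below keeps only the components that actually wrote something. `nonempty_streams` is a hypothetical helper, the `captured` list stands in for something like `err_map.stderr`, and the sketch assumes that iteration order matches component order, as it does in the output above.

```python
def nonempty_streams(streams):
    """Yield (component index, text) for every component whose stream has visible content."""
    for index, text in enumerate(streams):
        if text and text.strip():
            yield index, text.strip()


# Stand-in data; with a real map you could pass err_map.stderr instead.
captured = ["", "Hi from stderr! 1\n", "   \n", "Hi from stderr! 3\n"]
for index, text in nonempty_streams(captured):
    print(f"component {index}: {text}")
```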

