Message boards : Graphics cards (GPUs) : error on windows 10 pro 195 (0xc3) EXIT_CHILD_FAILED

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 101
Credit: 1,368,051,552
RAC: 403,415
Message 61928 - Posted: 14 Nov 2024 | 10:19:33 UTC
Last modified: 14 Nov 2024 | 10:23:22 UTC

Hi all,
over the last week a lot of WUs (acemd3 and ATMML) have stopped and aborted during execution on various older Nvidia GPUs (GTX 750 Ti with 2 GB, GTX 1060 with 3 GB, and others).

The report shows error 195 (0xc3) EXIT_CHILD_FAILED.

Any suggestions?
I updated the driver, but that did not eliminate the problem.
Thanks in advance.
K.

Here is the log (my PCs are not visible):

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
(unknown error) (0) - exit code 195 (0xc3)</message>
<stderr_txt>
13:43:32 (7156): wrapper (7.9.26016): starting
13:43:32 (7156): wrapper: running Library/usr/bin/tar.exe (xjvf input.tar.bz2)
aceforce_dft_v0.4.ckpt
Acellera-AToM-OpenMM-gitrepo/
Acellera-AToM-OpenMM-gitrepo/sync/
Acellera-AToM-OpenMM-gitrepo/sync/worker.py
Acellera-AToM-OpenMM-gitrepo/sync/atm.py
Acellera-AToM-OpenMM-gitrepo/sync/__init__.py
Acellera-AToM-OpenMM-gitrepo/ommsystem.py
Acellera-AToM-OpenMM-gitrepo/environment.yml
Acellera-AToM-OpenMM-gitrepo/openmm_async_re.py
Acellera-AToM-OpenMM-gitrepo/rbfe_structprep.py
Acellera-AToM-OpenMM-gitrepo/rbfe_explicit_sync.py
Acellera-AToM-OpenMM-gitrepo/ommreplica.py
Acellera-AToM-OpenMM-gitrepo/utils/
Acellera-AToM-OpenMM-gitrepo/utils/logging.conf
Acellera-AToM-OpenMM-gitrepo/utils/AtomUtils.py
Acellera-AToM-OpenMM-gitrepo/utils/timer.py
Acellera-AToM-OpenMM-gitrepo/utils/singal_guard.py
Acellera-AToM-OpenMM-gitrepo/utils/__init__.py
Acellera-AToM-OpenMM-gitrepo/atom_nnp_wrapper.py
Acellera-AToM-OpenMM-gitrepo/local_openmm_transport.py
Acellera-AToM-OpenMM-gitrepo/temperatureRE_explicit.py
Acellera-AToM-OpenMM-gitrepo/examples/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/temoa-g1.prmtop
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/temoa-g1.inpcrd
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/equil.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/npt.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/temoa-g1_asyncre.cntl
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/README.md
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/mdlambda.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/temoa-g1/mintherm.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/scripts/nodefile
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/scripts/runopenmm
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/dmso.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/dap.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/thi.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/but.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/dss.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/dapp.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/ligands/prop.mol2
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/receptor/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/receptor/fkbp.pdb
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/README.md
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/setup-settings.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/mdlambda_template.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/equil_template.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/prep_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/mintherm_template.py
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/runopenmm
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/run_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/asyncre_template.cntl
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/setup-atm.sh
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/nodefile
Acellera-AToM-OpenMM-gitrepo/examples/ABFE/fkbp/scripts/free_energies_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/nodefile
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/TYK2_m01_m04.prmtop
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/TYK2_m01_m04.inpcrd
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/TYK2_m01_m04.pdb
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/TYK2_m01_m04_asyncre.cntl
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/TYK2_m01_m04/README.md
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/scripts/runopenmm
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/scripts/nodefile
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/ligands/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/ligands/3a.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/ligands/2e.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/ligands/2d.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/ligands/3b.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/nodefile
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/setup-atm.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/run_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/prep_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/setup-settings.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/free_energies_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/asyncre_template.cntl
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/mintherm_template.py
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/runopenmm
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/scripts/equil_template.py
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/receptor/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/receptor/eralpha.pdb
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/eralpha/README.md
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1Q.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OIU.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1Q.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OI9.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1S.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OIU.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1S.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OIY.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OIY.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1OI9.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1R.mol2
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/ligands/1H1R.sdf
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/receptor/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/receptor/cdk2.pdb
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/README.md
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/prep_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/free_energies_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/run_template.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/asyncre_template.cntl
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/setup-settings.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/cdk2/scripts/setup-atm.sh
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/temoa-g1-g4_asyncre.cntl
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/temoa-g1-g4.prmtop
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/temoa-g1-g4.inpcrd
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/README.md
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/mintherm.py
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/equil.py
Acellera-AToM-OpenMM-gitrepo/examples/RBFE/temoa-g1-g4/npt.py
Acellera-AToM-OpenMM-gitrepo/examples/README.md
Acellera-AToM-OpenMM-gitrepo/examples/scripts/
Acellera-AToM-OpenMM-gitrepo/examples/scripts/analyze.sh
Acellera-AToM-OpenMM-gitrepo/examples/scripts/runopenmm
Acellera-AToM-OpenMM-gitrepo/examples/scripts/uwham_analysis.R
Acellera-AToM-OpenMM-gitrepo/examples/scripts/nodefile
Acellera-AToM-OpenMM-gitrepo/.github/
Acellera-AToM-OpenMM-gitrepo/.github/workflows/
Acellera-AToM-OpenMM-gitrepo/.github/workflows/publish.yml
Acellera-AToM-OpenMM-gitrepo/abfe_explicit.py
Acellera-AToM-OpenMM-gitrepo/setup.py
Acellera-AToM-OpenMM-gitrepo/transport.py
Acellera-AToM-OpenMM-gitrepo/README.md
Acellera-AToM-OpenMM-gitrepo/gibbs_sampling.py
Acellera-AToM-OpenMM-gitrepo/rbfe_explicit_zrestr.py
Acellera-AToM-OpenMM-gitrepo/LICENSE
Acellera-AToM-OpenMM-gitrepo/abfe_structprep.py
Acellera-AToM-OpenMM-gitrepo/abfe_explicit_zrestr.py
Acellera-AToM-OpenMM-gitrepo/ommworker.py
Acellera-AToM-OpenMM-gitrepo/rbfe_explicit.py
atom.tar
QB_A03_A13_0.xml
QB_A03_A13_asyncre.cntl
QB_A03_A13.inpcrd
QB_A03_A13.prmtop
run_26_09_2024_14_32_815cc40a/
run_26_09_2024_14_32_815cc40a/A03_A13/
run_26_09_2024_14_32_815cc40a/A03_A13/.cache/
run_26_09_2024_14_32_815cc40a/A03_A13/.cache/fontconfig/
run_26_09_2024_14_32_815cc40a/A03_A13/.cache/matplotlib/
run_26_09_2024_14_32_815cc40a/A03_A13/run_26_09_2024_14_25_e5a8eacc/
run_26_09_2024_14_32_815cc40a/A03_A13/build_1/
run_26_09_2024_14_32_815cc40a/A03_A13/.config/
run_26_09_2024_14_32_815cc40a/A03_A13/.config/matplotlib/
run_26_09_2024_14_32_815cc40a/A03_A13/.htmd/
run_26_09_2024_14_32_815cc40a/A03_A13/scratch/
run.bat
run.sh
13:43:39 (7156): Library/usr/bin/tar.exe exited; CPU time 0.000000
13:43:39 (7156): wrapper: running C:/Windows/system32/cmd.exe (/c call Scripts\activate.bat && Scripts\conda-unpack.exe && run.bat)
DEPRECATION: Loading egg at c:\programdata\boinc\slots\3\lib\site-packages\openmmtorch-1.0-py3.11-win-amd64.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation.. Discussion can be found at https://github.com/pypa/pip/issues/12330
DEPRECATION: Loading egg at c:\programdata\boinc\slots\3\lib\site-packages\openmmtorch-1.0-py3.11-win-amd64.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation.. Discussion can be found at https://github.com/pypa/pip/issues/12330
Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
[W output_modules.py:45] Warning: CUDA graph capture will lock the batch to the current number of samples (2). Changing this will result in a crash (function )
[W C:\cb\pytorch_1000000000000\work\third_party\nvfuser\csrc\graph_fuser.cpp:108] Warning: operator () profile_node %1201 : int[] = prim::profile_ivalue(%dims.28)
does not have profile information (function operator ())
[W output_modules.py:45] Warning: CUDA graph capture will lock the batch to the current number of samples (2). Changing this will result in a crash (function )
Traceback (most recent call last):
File "C:\ProgramData\BOINC\slots\3\Scripts\rbfe_explicit_sync.py", line 11, in <module>
rx.scheduleJobs()
File "C:\ProgramData\BOINC\slots\3\Lib\site-packages\sync\atm.py", line 126, in scheduleJobs
self.worker.run(replica)
File "C:\ProgramData\BOINC\slots\3\Lib\site-packages\sync\worker.py", line 124, in run
raise RuntimeError(f"Simulation failed {ntry} times!")
RuntimeError: Simulation failed 5 times!
13:53:30 (7156): C:/Windows/system32/cmd.exe exited; CPU time 369.000000
13:53:30 (7156): app exit status: 0x16
13:53:30 (7156): called boinc_finish(195)
0 bytes in 0 Free Blocks.
546 bytes in 8 Normal Blocks.
1144 bytes in 1 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 0 bytes.
Total allocations: 3447821 bytes.
Dumping objects ->
{12410275} normal block at 0x0000026061DA12F0, 48 bytes long.
Data: <PATH=C:\ProgramD> 50 41 54 48 3D 43 3A 5C 50 72 6F 67 72 61 6D 44
{12410264} normal block at 0x0000026061DA18A0, 48 bytes long.
Data: <HOME=C:\ProgramD> 48 4F 4D 45 3D 43 3A 5C 50 72 6F 67 72 61 6D 44
{12410253} normal block at 0x0000026061DA1600, 48 bytes long.
Data: <TMP=C:\ProgramDa> 54 4D 50 3D 43 3A 5C 50 72 6F 67 72 61 6D 44 61
{12410242} normal block at 0x0000026061DA1EC0, 48 bytes long.
Data: <TEMP=C:\ProgramD> 54 45 4D 50 3D 43 3A 5C 50 72 6F 67 72 61 6D 44
{12410231} normal block at 0x0000026061DA1590, 48 bytes long.
Data: <TMPDIR=C:\Progra> 54 4D 50 44 49 52 3D 43 3A 5C 50 72 6F 67 72 61
{12410200} normal block at 0x0000026061F6E410, 64 bytes long.
Data: <PATH=C:\ProgramD> 50 41 54 48 3D 43 3A 5C 50 72 6F 67 72 61 6D 44
{12410189} normal block at 0x0000026060383550, 145 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
..\api\boinc_api.cpp(309) : {12410186} normal block at 0x000002606035F610, 8 bytes long.
Data: < -`` > 00 00 2D 60 60 02 00 00
{12409362} normal block at 0x0000026060383F10, 145 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
{12408594} normal block at 0x000002606035F1B0, 8 bytes long.
Data: < 0&#240;a` > 00 30 F0 61 60 02 00 00
..\zip\boinc_zip.cpp(122) : {295} normal block at 0x000002606035FAB0, 260 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{280} normal block at 0x0000026060344DB0, 80 bytes long.
Data: </c call Scripts\> 2F 63 20 63 61 6C 6C 20 53 63 72 69 70 74 73 5C
{279} normal block at 0x0000026060355A80, 16 bytes long.
Data: < 6`` > 98 0F 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{278} normal block at 0x0000026060355CB0, 16 bytes long.
Data: <p 6`` > 70 0F 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{277} normal block at 0x0000026060355440, 16 bytes long.
Data: <H 6`` > 48 0F 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{276} normal block at 0x0000026060355BC0, 16 bytes long.
Data: < 6`` > 20 0F 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{275} normal block at 0x00000260603553F0, 16 bytes long.
Data: <&#248; 6`` > F8 0E 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{274} normal block at 0x00000260603553A0, 16 bytes long.
Data: <&#208; 6`` > D0 0E 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{273} normal block at 0x0000026060359F90, 48 bytes long.
Data: <ComSpec=C:\Windo> 43 6F 6D 53 70 65 63 3D 43 3A 5C 57 69 6E 64 6F
{272} normal block at 0x00000260603555D0, 16 bytes long.
Data: <HM4`` > 48 4D 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{271} normal block at 0x000002606035B860, 32 bytes long.
Data: <SystemRoot=C:\Wi> 53 79 73 74 65 6D 52 6F 6F 74 3D 43 3A 5C 57 69
{270} normal block at 0x0000026060355350, 16 bytes long.
Data: < M4`` > 20 4D 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{268} normal block at 0x0000026060355B70, 16 bytes long.
Data: <&#248;L4`` > F8 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{267} normal block at 0x0000026060355A30, 16 bytes long.
Data: <&#208;L4`` > D0 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{266} normal block at 0x0000026060355260, 16 bytes long.
Data: <&#168;L4`` > A8 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{265} normal block at 0x000002606035F110, 16 bytes long.
Data: < L4`` > 80 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{264} normal block at 0x000002606035F430, 16 bytes long.
Data: <XL4`` > 58 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{263} normal block at 0x000002606035B800, 32 bytes long.
Data: <CUDA_DEVICE=0 PU> 43 55 44 41 5F 44 45 56 49 43 45 3D 30 00 50 55
{262} normal block at 0x000002606035F340, 16 bytes long.
Data: <0L4`` > 30 4C 34 60 60 02 00 00 00 00 00 00 00 00 00 00
{261} normal block at 0x0000026060344C30, 320 bytes long.
Data: <@&#243;5`` &#184;5`` > 40 F3 35 60 60 02 00 00 00 B8 35 60 60 02 00 00
{260} normal block at 0x000002606035F2F0, 16 bytes long.
Data: <&#176; 6`` > B0 0E 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{259} normal block at 0x000002606035F2A0, 16 bytes long.
Data: < 6`` > 88 0E 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{258} normal block at 0x000002606035B500, 32 bytes long.
Data: <C:/Windows/syste> 43 3A 2F 57 69 6E 64 6F 77 73 2F 73 79 73 74 65
{257} normal block at 0x000002606035F660, 16 bytes long.
Data: <` 6`` > 60 0E 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{256} normal block at 0x000002606035BCE0, 32 bytes long.
Data: <xjvf input.tar.b> 78 6A 76 66 20 69 6E 70 75 74 2E 74 61 72 2E 62
{255} normal block at 0x000002606035F0C0, 16 bytes long.
Data: <&#168; 6`` > A8 0D 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{254} normal block at 0x000002606035F480, 16 bytes long.
Data: < 6`` > 80 0D 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{253} normal block at 0x000002606035F250, 16 bytes long.
Data: <X 6`` > 58 0D 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{252} normal block at 0x000002606035F5C0, 16 bytes long.
Data: <0 6`` > 30 0D 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{251} normal block at 0x000002606035F200, 16 bytes long.
Data: < 6`` > 08 0D 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{250} normal block at 0x000002606035F520, 16 bytes long.
Data: <&#224; 6`` > E0 0C 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{248} normal block at 0x000002606035F750, 16 bytes long.
Data: <@&#158;5`` > 40 9E 35 60 60 02 00 00 00 00 00 00 00 00 00 00
{247} normal block at 0x0000026060359E40, 40 bytes long.
Data: <P&#247;5`` &#228;&#246;a` > 50 F7 35 60 60 02 00 00 10 E4 F6 61 60 02 00 00
{246} normal block at 0x000002606035F160, 16 bytes long.
Data: <&#192; 6`` > C0 0C 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{245} normal block at 0x000002606035F7A0, 16 bytes long.
Data: < 6`` > 98 0C 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{244} normal block at 0x000002606035B920, 32 bytes long.
Data: <Library/usr/bin/> 4C 69 62 72 61 72 79 2F 75 73 72 2F 62 69 6E 2F
{243} normal block at 0x000002606035F700, 16 bytes long.
Data: <p 6`` > 70 0C 36 60 60 02 00 00 00 00 00 00 00 00 00 00
{242} normal block at 0x0000026060360C70, 992 bytes long.
Data: < &#247;5`` &#185;5`` > 00 F7 35 60 60 02 00 00 20 B9 35 60 60 02 00 00
{86} normal block at 0x000002606035BC20, 32 bytes long.
Data: <windows_x86_64__> 77 69 6E 64 6F 77 73 5F 78 38 36 5F 36 34 5F 5F
{85} normal block at 0x0000026060355AD0, 16 bytes long.
Data: < &#156;5`` > 10 9C 35 60 60 02 00 00 00 00 00 00 00 00 00 00
{84} normal block at 0x0000026060359C10, 40 bytes long.
Data: <&#208;Z5`` &#188;5`` > D0 5A 35 60 60 02 00 00 20 BC 35 60 60 02 00 00
{63} normal block at 0x0000026060355080, 16 bytes long.
Data: < &#234;J&#147;&#247; > 80 EA 4A 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{62} normal block at 0x0000026060355030, 16 bytes long.
Data: <@&#233;J&#147;&#247; > 40 E9 4A 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{61} normal block at 0x0000026060355800, 16 bytes long.
Data: <&#248;WG&#147;&#247; > F8 57 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{60} normal block at 0x0000026060355850, 16 bytes long.
Data: <&#216;WG&#147;&#247; > D8 57 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{59} normal block at 0x0000026060354FE0, 16 bytes long.
Data: <P G&#147;&#247; > 50 04 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{58} normal block at 0x0000026060355580, 16 bytes long.
Data: <0 G&#147;&#247; > 30 04 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{57} normal block at 0x0000026060355C60, 16 bytes long.
Data: <&#224; G&#147;&#247; > E0 02 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{56} normal block at 0x0000026060355760, 16 bytes long.
Data: < G&#147;&#247; > 10 04 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{55} normal block at 0x00000260603551C0, 16 bytes long.
Data: <p G&#147;&#247; > 70 04 47 93 F7 7F 00 00 00 00 00 00 00 00 00 00
{54} normal block at 0x0000026060355EE0, 16 bytes long.
Data: < &#192;E&#147;&#247; > 18 C0 45 93 F7 7F 00 00 00 00 00 00 00 00 00 00
Object dump complete.

</stderr_txt>
]]>
____________
Dreams do not always come true. But not because they are too big or impossible. Why did we stop believing.
(Martin Luther King)

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1629
Credit: 9,672,847,368
RAC: 7,829,637
Message 61930 - Posted: 14 Nov 2024 | 14:54:54 UTC

The significant part of that log is:

File "C:\ProgramData\BOINC\slots\3\Lib\site-packages\sync\worker.py", line 124, in run
raise RuntimeError(f"Simulation failed {ntry} times!")
RuntimeError: Simulation failed 5 times!
13:53:30 (7156): C:/Windows/system32/cmd.exe exited; CPU time 369.000000

incidentally revealing that you're running under Windows.

I saw the same error message yesterday in task 36614910. That's part of workunit 29941886 (third attempt).

I ran the same WU under Linux, and got a successful outcome. So it seems that there are (perhaps) still problems with the Windows app: there's not much you can do about that except be thankful that it didn't waste too much of your machine's time.
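For anyone who wants to pull the significant lines out of a long stderr dump automatically, here is a minimal sketch. The sample text is copied from the log in the first post; the function name `summarize_stderr` and the choice of patterns are my own assumptions about which lines matter, not anything provided by BOINC or the project.

```python
import re

# A few lines copied from the stderr in the first post.
STDERR_SAMPLE = """
Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
Traceback (most recent call last):
RuntimeError: Simulation failed 5 times!
13:53:30 (7156): C:/Windows/system32/cmd.exe exited; CPU time 369.000000
13:53:30 (7156): app exit status: 0x16
13:53:30 (7156): called boinc_finish(195)
"""

# Hypothetical patterns: which lines count as "significant" is an assumption.
PATTERNS = {
    "python_error": re.compile(r"^(\w+Error): (.*)$"),
    "app_exit_status": re.compile(r"app exit status: (0x[0-9a-fA-F]+)"),
    "boinc_finish": re.compile(r"called boinc_finish\((\d+)\)"),
}


def summarize_stderr(text):
    """Return the first match of each pattern found in the log text."""
    summary = {}
    for line in text.splitlines():
        for key, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and key not in summary:
                summary[key] = match.groups()
    return summary


summary = summarize_stderr(STDERR_SAMPLE)
for key, value in summary.items():
    print(key, value)
```

The `boinc_finish` value 195 is the same number as the 0xc3 in the thread title, just written in decimal.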

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 101
Credit: 1,368,051,552
RAC: 403,415
Message 61933 - Posted: 14 Nov 2024 | 17:49:35 UTC - in response to Message 61930.

Hi, thanks a lot for your help.

I hope the admins will solve this problem as soon as possible.

Naturally, my NVIDIA GPUs will continue to run to help this project,
even with some WUs ending in error.

Ciao

K.

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 101
Credit: 1,368,051,552
RAC: 403,415
Message 61954 - Posted: 25 Nov 2024 | 7:12:26 UTC - in response to Message 61933.
Last modified: 25 Nov 2024 | 7:13:15 UTC

Hi all,
11 days after this problem appeared, there is still no news about it.
Is it only my problem?
Thanks in advance for any help, especially from Barcelona.
K.

Keith Myers
Joined: 13 Dec 17
Posts: 1372
Credit: 7,994,279,654
RAC: 2,580,621
Message 61955 - Posted: 25 Nov 2024 | 8:02:49 UTC - in response to Message 61954.

Since you hide your computers, it is impossible to really help you: we can't see your OS, driver version, or type of cards.

But judging from your prior post, you won't be able to run the 750 Ti on any kind of task other than acemd3, since it doesn't meet the minimum CUDA compute capability of 6.0 required for other tasks such as the ATMML ones you have apparently been attempting.
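As a rough illustration of that compute-capability point, the sketch below compares the cards mentioned in this thread against the 6.0 minimum quoted above. The capability values are the ones NVIDIA publishes for these models; the table and the helper `meets_minimum` are hypothetical names for illustration, not part of any project tool.

```python
# Published CUDA compute capabilities for the cards named in this thread.
COMPUTE_CAPABILITY = {
    "GTX 750 Ti": (5, 0),
    "GTX 1060": (6, 1),
    "RTX 3050": (8, 6),
}

# Minimum quoted in the post above for non-acemd3 tasks such as ATMML.
MIN_CAPABILITY = (6, 0)


def meets_minimum(card, minimum=MIN_CAPABILITY):
    """Tuple comparison checks the major version first, then the minor."""
    return COMPUTE_CAPABILITY[card] >= minimum


for card, (major, minor) in COMPUTE_CAPABILITY.items():
    status = "ok" if meets_minimum(card) else "below minimum"
    print(f"{card}: {major}.{minor} -> {status}")
```

By this check only the 750 Ti falls below the minimum, so failures on a GTX 1060 or RTX 3050 would need a different explanation.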

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 101
Credit: 1,368,051,552
RAC: 403,415
Message 61957 - Posted: 25 Nov 2024 | 10:49:08 UTC - in response to Message 61955.
Last modified: 25 Nov 2024 | 11:35:35 UTC

Hi,
thanks a lot for your help. I posted the error log in my first post.
Under Windows 10 Pro the WUs are failing not only on GTX 750 Ti cards but also on GTX 1060 and RTX 3050.
About 50% abort with the same error, so I'll stop helping this project as soon as possible.
See you later, thanks again
K.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2363
Credit: 16,529,230,689
RAC: 3,228,704
Message 61958 - Posted: 25 Nov 2024 | 15:54:48 UTC
Last modified: 25 Nov 2024 | 15:55:26 UTC

These ATMMLs sometimes error out for obscure reasons:

Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
[W output_modules.py:45] Warning: CUDA graph capture will lock the batch to the current number of samples (2). Changing this will result in a crash (function )
[W output_modules.py:45] Warning: CUDA graph capture will lock the batch to the current number of samples (2). Changing this will result in a crash (function )
double free or corruption (!prev)
run.sh: line 24: 11774 Aborted (core dumped) python bin/rbfe_explicit_sync.py $CONFIG_FILE
2024-11-25 05:01:49 (11717): bin/bash exited; CPU time 10548.001464
2024-11-25 05:01:49 (11717): app exit status: 0x86
2024-11-25 05:01:49 (11717): called boinc_finish(195)

The meaning of "double free or corruption (!prev)" is unclear to me.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1083
Credit: 40,330,187,595
RAC: 4,846,684
Message 61959 - Posted: 25 Nov 2024 | 16:33:43 UTC - in response to Message 61958.
Last modified: 25 Nov 2024 | 16:40:42 UTC


It's a memory allocation issue. "Double free" means the program tried to free the same block of memory twice; the message comes from the C library's heap consistency check when it detects that, or some other corruption of the heap. It probably has nothing to do with you or your computer; more likely it is an error in the code for that task. Or it is just a random memory error that could be caused by a myriad of things (even cosmic rays), since none of it is ECC. I wouldn't worry about it, since this is the only error of that kind on your host.
____________
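To make the "double free" idea concrete, here is a toy model in Python. It is only a conceptual sketch: the class `ToyHeap` and its methods are invented for illustration, and the real C library allocator detects the condition through its own internal bookkeeping, not like this.

```python
class ToyHeap:
    """Toy allocator (hypothetical) that tracks which block ids are live."""

    def __init__(self):
        self._live = set()
        self._next_id = 0

    def malloc(self):
        # Hand out a fresh block id and remember that it is in use.
        block = self._next_id
        self._next_id += 1
        self._live.add(block)
        return block

    def free(self, block):
        # Freeing a block that is not live means it was already freed
        # (or the bookkeeping was corrupted), so abort loudly.
        if block not in self._live:
            raise RuntimeError("double free or corruption")
        self._live.remove(block)


heap = ToyHeap()
block = heap.malloc()
heap.free(block)          # first free is fine
try:
    heap.free(block)      # second free of the same block
    detected = False
except RuntimeError as exc:
    detected = True
    print("allocator check fired:", exc)
```

In the real task the equivalent check aborts the whole process instead of raising an exception, which is why the log shows "Aborted (core dumped)".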

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2363
Credit: 16,529,230,689
RAC: 3,228,704
Message 61960 - Posted: 26 Nov 2024 | 11:17:15 UTC - in response to Message 61959.

It's not the cosmic rays.
https://www.gpugrid.net/result.php?resultid=36832763

double free or corruption (!prev)
run.sh: line 24: 21888 Aborted (core dumped) python bin/rbfe_explicit_sync.py $CONFIG_FILE
2024-11-26 06:40:02 (21831): bin/bash exited; CPU time 7931.535572
2024-11-26 06:40:02 (21831): app exit status: 0x86
2024-11-26 06:40:02 (21831): called boinc_finish(195)
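The exit status in this Linux log can be decoded with a few lines of Python. This is a sketch under two assumptions: that the wrapper is reporting bash's own exit status, and that the usual shell convention applies (a status above 128 means the child was killed by signal `status - 128`). The helper name `decode_shell_status` and the small signal table are made up for illustration.

```python
# Common Linux signal numbers (only the few needed for illustration).
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def decode_shell_status(status):
    """Interpret a shell exit status using the 128 + signal convention."""
    if status > 128:
        signum = status - 128
        name = SIGNAL_NAMES.get(signum, "unknown")
        return f"killed by signal {signum} ({name})"
    return f"exited with code {status}"


print(decode_shell_status(0x86))
```

0x86 is 134, i.e. 128 + 6, which is consistent with the "Aborted (core dumped)" line in the log. The convention is specific to Unix shells, so it should not be applied to the 0x16 status in the Windows log from the first post.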

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1083
Credit: 40,330,187,595
RAC: 4,846,684
Message 61961 - Posted: 26 Nov 2024 | 12:38:43 UTC - in response to Message 61960.
Last modified: 26 Nov 2024 | 12:39:38 UTC

"could be caused by a myriad of things"

If you think it's related to your host, you could try testing or replacing the memory.
____________

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2363
Credit: 16,529,230,689
RAC: 3,228,704
Message 61962 - Posted: 26 Nov 2024 | 16:55:39 UTC - in response to Message 61961.

Thanks for the suggestion. These are some older Kingston Predator RAM modules, "overclocked" to 2666 MHz. I'll wait for the current WU to finish, and then I'll set the RAM speed and voltage to default.
