
Ticket #150 (closed defect: fixed)

Opened 2 months ago

Last modified 1 month ago

TPS counter inconsistencies in lj_liquid.bmark

Reported by: ldpaniak Assigned to: joaander
Priority: minor Milestone:
Component: Benchmarks Keywords:
Cc:

Description

During a long run of the lj_liquid benchmark on two GTX 280 GPUs, I noticed the TPS counter abruptly doubling:

Time 00:30:00 | Step 71778 / 504000 | TPS 37.53 | ETA 03:11:56
Time 00:30:10 | Step 72155 / 504000 | TPS 37.68 | ETA 03:11:01
Time 00:30:20 | Step 72678 / 504000 | TPS 52.27 | ETA 02:17:31
Time 00:30:30 | Step 73435 / 504000 | TPS 75.65 | ETA 01:34:51
Time 00:30:40 | Step 74155 / 504000 | TPS 71.97 | ETA 01:39:32
Time 00:30:50 | Step 74914 / 504000 | TPS 75.81 | ETA 01:34:19

and, later in the same run, dropping back:

Time 00:41:07 | Step 120412 / 504000 | TPS 74.54 | ETA 01:25:45
Time 00:41:17 | Step 121118 / 504000 | TPS 70.59 | ETA 01:30:23
Time 00:41:27 | Step 121861 / 504000 | TPS 74.23 | ETA 01:25:48
Time 00:41:37 | Step 122552 / 504000 | TPS 69.02 | ETA 01:32:06
Time 00:41:47 | Step 123280 / 504000 | TPS 72.78 | ETA 01:27:10
Time 00:41:58 | Step 123969 / 504000 | TPS 68.43 | ETA 01:32:33
Time 00:42:08 | Step 124499 / 504000 | TPS 51.98 | ETA 02:01:40
Time 00:42:18 | Step 124781 / 504000 | TPS 28.04 | ETA 03:45:23
Time 00:42:28 | Step 125144 / 504000 | TPS 35.88 | ETA 02:55:59
Time 00:42:38 | Step 125428 / 504000 | TPS 28.23 | ETA 03:43:28
Time 00:42:48 | Step 125753 / 504000 | TPS 32.11 | ETA 03:16:21
Time 00:42:58 | Step 126089 / 504000 | TPS 33.57 | ETA 03:07:38
Time 00:43:08 | Step 126425 / 504000 | TPS 33.59 | ETA 03:07:20
Time 00:43:18 | Step 126763 / 504000 | TPS 33.74 | ETA 03:06:20

From previous, shorter runs, the expected TPS is around 37. Between the two excerpts above, the reported TPS is roughly double that.

Could a counter be wrapping somewhere?
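One way to check whether the TPS figure itself is miscomputed is to recompute it from the Time and Step columns of consecutive status lines and compare it with the reported value. The following is a minimal Python sketch, not part of HOOMD; the names STATUS_RE, parse_status and recompute_tps are hypothetical, and the line format is assumed to be exactly the one quoted in this ticket. Timestamps have 1-second resolution, so the recomputed value is only approximate.

```python
import re

# Assumed format, taken from the status lines quoted in this ticket, e.g.
# "Time 00:30:00 | Step 71778 / 504000 | TPS 37.53 | ETA 03:11:56"
STATUS_RE = re.compile(
    r"Time (\d+):(\d+):(\d+) \| Step (\d+) / \d+ \| TPS ([\d.]+)"
)


def parse_status(line):
    """Return (elapsed_seconds, step, reported_tps), or None if no match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    hours, minutes, seconds = int(m.group(1)), int(m.group(2)), int(m.group(3))
    elapsed = hours * 3600 + minutes * 60 + seconds
    return elapsed, int(m.group(4)), float(m.group(5))


def recompute_tps(lines):
    """Yield (step, recomputed_tps, reported_tps) for consecutive status lines."""
    rows = [row for row in map(parse_status, lines) if row is not None]
    for (t0, s0, _), (t1, s1, reported) in zip(rows, rows[1:]):
        elapsed = t1 - t0
        if elapsed > 0:  # skip lines with identical timestamps
            yield s1, (s1 - s0) / elapsed, reported


# First four status lines from the excerpt above.
sample = [
    "Time 00:30:00 | Step 71778 / 504000 | TPS 37.53 | ETA 03:11:56",
    "Time 00:30:10 | Step 72155 / 504000 | TPS 37.68 | ETA 03:11:01",
    "Time 00:30:20 | Step 72678 / 504000 | TPS 52.27 | ETA 02:17:31",
    "Time 00:30:30 | Step 73435 / 504000 | TPS 75.65 | ETA 01:34:51",
]

results = list(recompute_tps(sample))
for step, recomputed, reported in results:
    print(f"step {step}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

If the recomputed and reported values agree, the Step column itself is advancing at the reported rate, which would point away from a display-only counter problem; if they disagree, the TPS calculation would be the first place to look.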

Change History

10/02/08 16:17:56 changed by joaander

  • status changed from new to assigned.

How many particles did you modify the benchmark to run with?

10/02/08 18:04:03 changed by ldpaniak

Sorry, that was a lazy bug report. Details:

HOOMD svnversion 1293 (gpu-reduce-comm branch)

lj_liquid.bmark with:

init.create_random(N = 675000 , phi_p = 0.2 , name =  A , min_dist = 0.7 )
pair.lj(r_cut = 3.0 )
***Warning! Virial data structure not yet implemented on the GPU
coeff.set( A , A , {'epsilon': 1.0, 'sigma': 1.0, 'alpha': 1.0} )
integrate.nvt(dt= 0.005 , T= 1.2 , tau= 0.5 )

I checked syslog for entries around the times the counter changed and found none.

10/02/08 18:34:02 changed by joaander

OK, I'll see if I can reproduce the issue here on a single GTX 280. I'm going to wait until after I revamp the logging system (#110) so I can log a few useful quantities to help debug (namely, the temperature/average particle speed).

10/03/08 21:05:54 changed by joaander

Note to self: TODO list

  • Write nve limit script to avoid particles leaving the box
  • Log kinetic energy
  • Dump configurations to see if the sim runs correctly
  • Try also with kernel 2.6.26

10/13/08 13:32:49 changed by joaander

  • status changed from assigned to closed.
  • resolution set to fixed.

Solved in r1338 (see comment:ticket:156:30).

