Commit Graph

617 Commits

Author SHA1 Message Date
Glenn Jocher e35397ee41 updates 2019-12-08 17:52:44 -08:00
Glenn Jocher 267367b105 updates 2019-12-08 16:34:27 -08:00
Glenn Jocher 0fa4e498c1 updates 2019-12-08 16:34:01 -08:00
Glenn Jocher b81beb0f5f updates 2019-12-07 22:55:26 -08:00
Glenn Jocher 1f943e886f updates 2019-12-07 15:17:29 -08:00
Glenn Jocher 55ba979816 updates 2019-12-07 01:26:41 -08:00
Glenn Jocher bb54408f73 updates 2019-12-07 00:05:37 -08:00
Glenn Jocher d5176e4fc4 updates 2019-12-07 00:01:18 -08:00
Glenn Jocher 2c0985f366 updates 2019-12-06 23:58:47 -08:00
Glenn Jocher a066a7b8ea updates 2019-12-06 19:05:51 -08:00
Glenn Jocher 63c2736c12 updates 2019-12-04 23:02:32 -08:00
Glenn Jocher 0a04eb9ff1 updates 2019-12-04 15:15:42 -08:00
Glenn Jocher a2dc8a6b5a updates 2019-12-04 15:15:23 -08:00
Glenn Jocher 93a70d958a updates 2019-12-02 11:31:19 -08:00
Glenn Jocher 3d91731519 updates 2019-12-01 14:07:09 -08:00
Glenn Jocher e637ae44dd updates 2019-12-01 14:06:11 -08:00
Glenn Jocher d6a7a614dc updates 2019-12-01 13:51:55 -08:00
Glenn Jocher 92690302bb updates 2019-12-01 13:49:38 -08:00
Glenn Jocher e613bbc88c updates 2019-11-29 19:10:01 -08:00
Glenn Jocher 9e9a6a1425 updates 2019-11-27 15:50:29 -10:00
Glenn Jocher 82b62c9855 updates 2019-11-27 15:50:00 -10:00
Glenn Jocher 3c57ff7b1b updates 2019-11-25 17:24:05 -10:00
Glenn Jocher 75e8ec323f updates 2019-11-25 11:45:28 -10:00
Francisco Reveriano 26e3a28bee Update train.py for distributive programming (#655)
When attempting to run this function in a multi-GPU environment, I kept getting a runtime error. I was able to solve the problem by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel. I first found the solution here:
https://github.com/pytorch/pytorch/issues/22436
and in the PyTorch tutorial.

'RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). '
2019-11-24 22:21:36 -10:00
Glenn Jocher 7773651e8e updates 2019-11-24 18:38:30 -10:00
Glenn Jocher f12a2a513a updates 2019-11-24 18:29:29 -10:00
Glenn Jocher f38723c0bd updates 2019-11-20 19:34:22 -08:00
Glenn Jocher 3a4ed8b3ab updates 2019-11-20 13:40:24 -08:00
Glenn Jocher bb209111c4 updates 2019-11-20 13:36:15 -08:00
Glenn Jocher 8e327e3bd0 updates 2019-11-20 13:33:25 -08:00
Glenn Jocher 2950f4c816 updates 2019-11-20 13:26:50 -08:00
Glenn Jocher c14ea59c71 updates 2019-11-20 13:24:50 -08:00
Glenn Jocher bd498ae776 updates 2019-11-20 13:14:24 -08:00
Glenn Jocher e58f0a68b6 updates 2019-11-20 12:05:40 -08:00
Glenn Jocher d355e539d9 updates 2019-11-19 18:47:22 -08:00
Glenn Jocher d9805d2fb6 updates 2019-11-19 12:42:12 -08:00
Glenn Jocher 2ba1a4c9cc updates 2019-11-18 12:01:17 -08:00
Glenn Jocher 9c716a39c3 updates 2019-11-17 19:00:12 -08:00
Glenn Jocher a1151c04a7 updates 2019-11-17 18:48:50 -08:00
Glenn Jocher fe9ade6a64 updates 2019-11-16 12:07:19 -08:00
Glenn Jocher 985006a52a updates 2019-11-14 17:25:29 -08:00
Glenn Jocher 9daa5e858a updates 2019-11-14 17:22:09 -08:00
Glenn Jocher fedc2150b3 updates 2019-11-14 17:12:55 -08:00
Glenn Jocher 6047be35cf updates 2019-11-14 15:08:58 -08:00
Glenn Jocher a96e010251 updates 2019-11-14 15:07:27 -08:00
Glenn Jocher 579fdc57f8 updates 2019-11-09 10:56:38 -08:00
Glenn Jocher 97ac36ec6c updates 2019-11-08 10:19:46 -08:00
Glenn Jocher d0e000b008 updates 2019-11-07 20:11:03 -08:00
Glenn Jocher 09ca721f88 updates 2019-11-06 10:10:53 -08:00
Glenn Jocher f7f8bb23c2 updates 2019-11-04 16:34:45 -08:00
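
The fix described in commit 26e3a28bee (passing find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel) can be illustrated with a minimal sketch. This is not the repository's train.py: the model PartiallyUsedNet, the helper train_steps, the single-process gloo group, and the file-based rendezvous path are all assumptions chosen so the example runs on one CPU on Linux or macOS. The model deliberately holds a layer whose parameters never contribute to the loss, which is the situation the quoted RuntimeError describes.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class PartiallyUsedNet(nn.Module):
    """Hypothetical model: `aux` has parameters that never reach the loss."""

    def __init__(self):
        super().__init__()
        self.main = nn.Linear(4, 2)
        self.aux = nn.Linear(4, 2)  # deliberately unused in forward()

    def forward(self, x):
        return self.main(x)


def train_steps(steps=2):
    """Run a few optimizer steps under DDP and return the loss values."""
    # Assumption: a single-process CPU group with file-based rendezvous,
    # so the sketch needs neither GPUs nor a free TCP port.
    store = os.path.join(tempfile.mkdtemp(), "ddp_store")
    dist.init_process_group(
        backend="gloo", init_method=f"file://{store}", rank=0, world_size=1
    )
    try:
        # The keyword from the commit message: tell DDP to look for
        # parameters that did not take part in producing the loss.
        model = DDP(PartiallyUsedNet(), find_unused_parameters=True)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        losses = []
        for _ in range(steps):
            optimizer.zero_grad()
            loss = model(torch.randn(8, 4)).pow(2).mean()
            loss.backward()
            optimizer.step()
            losses.append(loss.item())
        return losses
    finally:
        dist.destroy_process_group()
```

Calling train_steps(2) runs two forward/backward passes, which matters because the error quoted in the commit message is raised when a new iteration starts before the previous reduction has finished. The flag adds a traversal of the autograd graph on every iteration, so it is usually worth enabling only when some parameters really are unused.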