Currently, the GlobalSharedTransformerCIFn, which we use for Jose, concatenates the inputs to each module and passes the result through the transformer. I think it would be nicer to also include the output of the final module, but that's a little more code and probably not much benefit.
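To make the difference concrete, here is a rough sketch of the two variants. This is not the actual GlobalSharedTransformerCIFn code: the names `global_ci_fn`, `concat_parts`, and `final_output` are made up for illustration, the transformer is a stand-in callable, and flat concatenation of plain lists is an assumption (the real function may well concatenate along a sequence axis instead).

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def concat_parts(parts: Sequence[Vector]) -> Vector:
    """Flatten a sequence of vectors into one long vector (assumed layout)."""
    return [x for vec in parts for x in vec]


def global_ci_fn(
    module_inputs: Sequence[Vector],
    transformer: Callable[[Vector], Vector],
    final_output: Optional[Vector] = None,
) -> Vector:
    """Hypothetical sketch: concatenate per-module inputs, then apply the transformer."""
    # Current behaviour as described: only the inputs to each module are used.
    parts = list(module_inputs)
    # Proposed variant: also append the output of the final module, if given.
    if final_output is not None:
        parts.append(final_output)
    return transformer(concat_parts(parts))
```

With an identity function standing in for the transformer, `global_ci_fn([[1.0, 2.0], [3.0]], lambda v: v)` sees only the module inputs, while passing `final_output=[4.0]` adds the final module's output to the end of what the transformer receives. In this sketch the extra code is one optional argument and one branch, which fits the "a little more code" estimate.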