Multiple shared objects for tvm4j runtimes (Android)



Based on my experiments, loading multiple shared objects (one per model) increases runtime latency a lot, due to cross-referencing of the heap between the models loaded in a single tvm4j runtime. Is there any mechanism to make this faster inside the runtime? Another option would be to somehow merge the .so files into a single one.



I am also interested in this. I’d like to implement merging of the shared objects, because my current project requires 4 separate models to be generated.


I think we can load multiple models at a time and keep them open for as long as CPU/GPU resources are available.

I think merging the .so files may not make much sense here.


Of course, it depends on the type of models and applications you’re using. For totally unrelated models it doesn’t make sense, and that’s not the point of my question. What I’m asking about is merging multiple .so files when the models are related, or even based on the same computation graph.


Can you elaborate on this?


Sorry for my late reply! Basically, let’s say there’s a base model plus a softmax. The softmax part is responsible for some classification, and we want to use the features extracted by the base for later tasks if the classification was satisfactory. Or we can even have two separate models for these tasks, where the result of one depends on the other. In this case, there have to be at least two shared libs, and it’s not possible to merge them together.
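To make the base + softmax scenario concrete, here is a minimal sketch in plain Python. It does not use any TVM API: `base_model`, `classifier_head` and `downstream` are hypothetical stand-ins for whatever the compiled modules expose, and the 0.8 threshold is an arbitrary assumption.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def softmax(logits: Vector) -> Vector:
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_cascade(
    inputs: Vector,
    base_model: Callable[[Vector], Vector],       # hypothetical: first .so
    classifier_head: Callable[[Vector], Vector],  # hypothetical: returns logits
    downstream: Callable[[Vector], float],        # hypothetical: second .so
    threshold: float = 0.8,                       # assumed confidence cut-off
) -> Tuple[int, Optional[float]]:
    # Run the base once and keep its features around.
    features = base_model(inputs)
    probs = softmax(classifier_head(features))
    best = max(range(len(probs)), key=probs.__getitem__)
    # Only pay for the second model when the classification is confident.
    if probs[best] < threshold:
        return best, None
    return best, downstream(features)
```

The point of the sketch is that the features are computed once and the second model runs only when the classification passes, which is why the two parts end up as separate libraries in my setup.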



I see multiple options:

1: Merge during compilation, by extending the from_* frontend calls so they can add a new import to an existing graph.
2: Merge at runtime, by loading the second graph and joining it to the first (if the <shape, dtype> of the first graph’s outputs match the <shape, dtype> of the second graph’s inputs).
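The <shape, dtype> condition in option 2 could be checked along these lines. This is only a sketch in plain Python: `TensorSpec` and `can_chain` are hypothetical names, not part of TVM, and it assumes outputs are matched to inputs by position.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TensorSpec:
    # Hypothetical description of one graph input or output.
    shape: Tuple[int, ...]
    dtype: str


def can_chain(
    first_outputs: Sequence[TensorSpec],
    second_inputs: Sequence[TensorSpec],
) -> bool:
    # The graphs can only be joined if every output of the first graph
    # lines up (by position) with an input of the second graph.
    if len(first_outputs) != len(second_inputs):
        return False
    return all(
        out.shape == inp.shape and out.dtype == inp.dtype
        for out, inp in zip(first_outputs, second_inputs)
    )
```

For example, a first graph producing a `(1, 512)` `float32` tensor can feed a second graph expecting exactly that, but not one expecting `float16` or `(1, 256)`.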


Thanks for the suggestions! I believe I’m using the 2nd option now, loading/creating the graph runtime for the second model (.so, etc.) on demand, if that’s what you meant by merging at runtime.
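By "on demand" I mean something like the following sketch. It is plain Python with no TVM calls: `LazyModule` is a hypothetical helper, and `loader` stands for whatever actually loads the second .so and creates its graph runtime.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyModule(Generic[T]):
    """Defers an expensive load until the module is first needed."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader  # hypothetical: loads the .so, builds the runtime
        self._module: Optional[T] = None
        self._loaded = False

    def get(self) -> T:
        # Load on first use, then keep the module open and reuse it.
        if not self._loaded:
            self._module = self._loader()
            self._loaded = True
        return self._module  # type: ignore[return-value]
```

The first call to `get()` pays the loading cost; later calls return the same object.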

I’m not aware of the 1st option and couldn’t find it, so could you point me to what exactly it is? An example or something?


The 2nd option is interesting because it allows flexibility on the target, but there are some challenges:
1: Duplicate node names when loading the 2nd model as an extension of the first one.
2: There might be some storage plan conflicts (just a suspicion).
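For the duplicate-name problem, one possible workaround is to prefix every name in the second graph before joining. A rough sketch in plain Python follows; the graph layout (a dict with a `nodes` list of entries that each have a `name`) is a simplified assumption, and both function names are hypothetical.

```python
import copy
from typing import Any, Dict, List, Tuple

Graph = Dict[str, Any]


def find_name_clashes(graph_a: Graph, graph_b: Graph) -> List[str]:
    # Names that appear in both graphs and would collide after a join.
    names_a = {node["name"] for node in graph_a["nodes"]}
    return sorted(
        node["name"] for node in graph_b["nodes"] if node["name"] in names_a
    )


def prefix_node_names(
    graph: Graph, params: Dict[str, Any], prefix: str
) -> Tuple[Graph, Dict[str, Any]]:
    # Work on a copy so the caller's graph is left untouched.
    renamed = copy.deepcopy(graph)
    for node in renamed["nodes"]:
        node["name"] = prefix + node["name"]
    # Parameters are keyed by node name, so they must be renamed as well.
    renamed_params = {prefix + name: value for name, value in params.items()}
    return renamed, renamed_params
```

This only deals with names; it does nothing about possible storage plan conflicts.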

The 1st option means changing the frontend calls (from_onnx, from_mxnet, …) to allow importing a second model as an extension of the first. This should be straightforward and easy.