Currently, the Neo4j database regeneration step is severely CPU-bound, with all the work happening in the Neo4j process: around 30 minutes at 150% CPU load on average. This is not about the graph generation step at the end, which does not take long, but about inserting the data retrieved via the API. Given that this is (or should be) a relatively simple operation, there are probably major ways to optimize it. Maybe batch insertion?
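As a rough illustration of the batch insertion idea: instead of issuing one CREATE statement per record retrieved from the API, records can be grouped into batches and each batch sent as a single parameterized UNWIND statement, so Neo4j parses and plans one query per batch rather than one per record. This is only a sketch, assuming the importer uses the official Python Neo4j driver; the `Post` label and its properties are hypothetical placeholders, not the actual Graphryder schema.

```python
def chunked(records, batch_size):
    """Split a list of records into batches of at most batch_size items."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

# Hypothetical statement: the Post label and properties are placeholders
# standing in for whatever the real import schema looks like.
BATCH_INSERT = """
UNWIND $rows AS row
CREATE (p:Post {id: row.id, title: row.title})
"""

def insert_all(session, records, batch_size=1000):
    """Run one UNWIND-based write per batch via a Neo4j driver session,
    instead of one round-trip and one query plan per record."""
    for batch in chunked(records, batch_size):
        session.run(BATCH_INSERT, rows=batch)
```

Batch sizes in the hundreds to low thousands are a common starting point; too-large batches can instead blow up transaction memory, so this would need measuring against the real data set.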
(This will not have to be solved when re-implementing the Graphryder API as a Discourse plugin, as Neo4j will then no longer be needed at all; it will be replaced with Redis, which is already in use in Discourse.)