07-06-2019 08:10 AM
I have a project written in SystemVerilog. Many constant functions are defined and called in the source code to make the project configurable through parameters.
But when I open it in Vivado 2018.3 or Vivado 2019.1, the hierarchy refresh takes a very long time (almost 1 hour). In Windows Task Manager, two running processes named 'srcscanner.exe' can be found, and they keep allocating memory. The memory used by each 'srcscanner.exe' can grow to as much as 40 GB (including active memory and the page file on disk). This issue is so severe that I can't work properly with Vivado 2018.3 or Vivado 2019.1.
Vivado 2017.4, however, opens my project properly, and the hierarchy refresh takes only 2 or 3 minutes. In Windows Task Manager, 'srcscanner.exe' uses at most about 150 MB of memory.
I have tried splitting the source code into pieces to find which part causes srcscanner.exe to use so much memory in Vivado 2018.3. So far I have only found a subset of the source code whose hierarchy refresh finishes normally; as soon as an extra module named 'mux_byidx' (its source code is posted at the end of this message) is added, the long, memory-hungry hierarchy refresh comes back.
By contrast, Vivado 2017.4 completes the hierarchy refresh normally.
I can't post my whole project here for security reasons. I can only post the source code of 'mux_byidx', which seems to make srcscanner.exe go wrong.
One more thing I noticed: the more instances of module 'mux_byidx' are instantiated in the project, the more memory and time srcscanner.exe uses in Vivado 2018.3 and Vivado 2019.1. So my guess is that srcscanner.exe uses the same algorithm to parse modules and constant functions, and that the instances of constant functions are not destroyed after use and remain in memory. The more constant functions are called, the more memory is used and never freed, which leads to the huge memory usage in Vivado 2018.3 and 2019.1.
I hope my guess helps you fix this issue.
Related source code of module 'mux_byidx':
module pipedelay_taps_packedarray #(
    parameter int DATABITW = 8,
    parameter int ARRAYSIZ = 2,
    parameter bit REVERSE_BITORDER = 0,
    parameter int signed DELAYTAPS = 0
) (clk, aclr, sclr, enable, x, pipe_x);
    input bit clk;
    input wire aclr;
    input wire sclr;
    input wire enable;
    localparam int topbitidx = REVERSE_BITORDER ? 0 : DATABITW - 1;
    localparam int btmbitidx = REVERSE_BITORDER ? DATABITW - 1 : 0;
    input wire [ARRAYSIZ-1:0][topbitidx:btmbitidx] x;
    output logic[ARRAYSIZ-1:0][topbitidx:btmbitidx] pipe_x;
    generate
        if (DELAYTAPS > 0) begin
            logic[ARRAYSIZ-1:0][topbitidx:btmbitidx] pipe_array[DELAYTAPS-1:0];
            always_ff @(`CLKTABLE_POSEDGE_ASYNC_CLR(clk, aclr)) begin
                for (int i = 0; i < DELAYTAPS; i++) begin
                    if (aclr) pipe_array[i] <= '0;
                    else if (sclr) pipe_array[i] <= '0;
                    else if (enable) pipe_array[i] <= (i == 0) ? x : pipe_array[i-1];
                    else pipe_array[i] <= pipe_array[i];
                end
            end
            assign pipe_x = pipe_array[DELAYTAPS-1];
        end else begin
            assign pipe_x = x;
        end
    endgenerate
endmodule

module pipedelay_taps #(
    parameter int DATABITW = 8,
    parameter bit REVERSE_BITORDER = 0,
    parameter int signed DELAYTAPS = 0
) (clk, aclr, sclr, enable, x, pipe_x);
    input bit clk;
    input wire aclr;
    input wire sclr;
    input wire enable;
    localparam int topbitidx = REVERSE_BITORDER ? 0 : DATABITW - 1;
    localparam int btmbitidx = REVERSE_BITORDER ? DATABITW - 1 : 0;
    input wire [topbitidx:btmbitidx]x;
    output logic[topbitidx:btmbitidx]pipe_x;
    wire[0:0][topbitidx:btmbitidx]xx, pipe_xx;
    assign xx = x;
    pipedelay_taps_packedarray #(
        .DATABITW(DATABITW), .ARRAYSIZ(1),
        .REVERSE_BITORDER(REVERSE_BITORDER), .DELAYTAPS(DELAYTAPS)
    ) pdtpi(
        .clk(clk), .aclr(aclr), .sclr(sclr), .enable(enable),
        .x(xx), .pipe_x(pipe_xx)
    );
    assign pipe_x = pipe_xx;
endmodule

package mux_pkg;
    function automatic int bits_of_integer(int unsigned value, int maxbits);
        for (int i = 1; i < maxbits; i++) begin
            if (value < 2**i) return i;
        end
        return maxbits;
    endfunction

    function automatic int idxbitw_ofmux(int inputcnt);
        if (inputcnt <= 0) return 1;
        return bits_of_integer(inputcnt - 1, 31);
    endfunction

    function automatic int delaytaps4stage(int stagecnt, int istage, int totaltaps, bit top_first);
        int judge_0, judge_1;
        if (top_first) begin
            judge_1 = (istage + 0)*totaltaps/stagecnt;
            judge_0 = (istage + 1)*totaltaps/stagecnt;
        end else begin
            judge_1 = (stagecnt - (istage + 1))*totaltaps/stagecnt;
            judge_0 = (stagecnt - (istage + 0))*totaltaps/stagecnt;
        end
        return judge_0 - judge_1;
    endfunction
endpackage

module muxinput_fit2idx #(
    parameter int UNITBITW = 8,
    parameter int INPUTCNT = 5,
    parameter int IDXBITW = 2
) (
    input wire [UNITBITW-1:0] data_in[INPUTCNT-1:0],
    input wire [UNITBITW-1:0] data4nocs,
    output wire [UNITBITW-1:0] data_out[2**IDXBITW-1:0]
);
    initial
        if (INPUTCNT > 2**IDXBITW)
            $error("muxinput_fit2idx : parameter INPUTCNT(%0d) should not be greator than which can be hold(%0d) by index of IDXBITW(%0d) bits", INPUTCNT, 2**IDXBITW, IDXBITW);
    genvar i;
    generate
        for (i = 0; i < (2**IDXBITW); i++) begin: FILLINPUT
            if (i < INPUTCNT) assign data_out[i] = data_in[i];
            else assign data_out[i] = data4nocs;
        end
    endgenerate
endmodule

module mux_byidx #(
    parameter int UNITBITW = 8,
    parameter int INPUTCNT = 5,
    parameter int DELAYTAPS = 0
) (clk, aclr, sclr,clken, data_in, data4nocs, idx, data_out);
    input bit clk;
    input wire aclr;
    input wire sclr;
    input wire clken;
    input wire[UNITBITW-1:0] data_in[INPUTCNT-1:0];
    input wire[UNITBITW-1:0] data4nocs;
    localparam int idxbitw = mux_pkg::idxbitw_ofmux(INPUTCNT);
    input wire[idxbitw -1:0] idx;
    output wire[UNITBITW-1:0] data_out;
    wire[UNITBITW-1:0]data2sel[2**idxbitw-1:0];
    muxinput_fit2idx #(
        .UNITBITW(UNITBITW), .INPUTCNT(INPUTCNT), .IDXBITW(idxbitw)
    ) input_fix2idx(
        .data_in(data_in), .data4nocs(data4nocs), .data_out(data2sel)
    );
    localparam int taps2mux = DELAYTAPS;
    genvar i,j,k;
    generate
        for (i = idxbitw-1; i >= 0; i--) begin: STAGE
            localparam int delaytaps_stage = miscs::delaytaps4stage(idxbitw, i, taps2mux, 1'b0);
            logic[UNITBITW-1:0] stage_in [(2**(i+1))-1:0];
            logic[UNITBITW-1:0] stage_out[(2**i)-1:0];
            for (j = 0; j < 2**(i+1); j++) begin: INPUT
                if (i == idxbitw - 1) assign stage_in[j] = data2sel[j];
                else assign stage_in[j] = STAGE[i+1].stage_out[j];
            end: INPUT
            logic [i:0] idx_in;
            if (i > 0) begin: IDX2NS
                logic[i-1:0]idx_out;
                pipedelay_taps #(
                    .DATABITW(i), .REVERSE_BITORDER(1'b0), .DELAYTAPS(delaytaps_stage)
                ) pipe_idx(
                    .clk(clk), .aclr(aclr), .sclr(sclr), .enable(clken),
                    .x(idx_in[i-1:0]), .pipe_x(idx_out)
                );
            end: IDX2NS
            if (i == idxbitw - 1) assign idx_in = idx;
            else assign idx_in = STAGE[i+1].IDX2NS.idx_out;
            for (j = 0; j < 2**i; j++) begin: MUX_2_1
                wire[UNITBITW-1:0] stage_2o = idx_in[i] ? stage_in[2**i + j] : stage_in[j];
                pipedelay_taps #(
                    .DATABITW(UNITBITW), .REVERSE_BITORDER(1'b0), .DELAYTAPS(delaytaps_stage)
                ) pipe_stage_out(
                    .clk(clk), .aclr(aclr), .sclr(sclr), .enable(clken),
                    .x(stage_2o), .pipe_x(stage_out[j])
                );
            end: MUX_2_1
            if (i == 0) assign data_out = stage_out;
        end: STAGE
    endgenerate
endmodule
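For anyone trying to reproduce this, it may help to see what the constant functions evaluate to for a given configuration. The following is only a minimal sketch: it copies the three functions from mux_pkg into a stand-alone package and prints the index width and the per-stage delay taps. The names mux_pkg_check and tb_mux_pkg_check and the value TAPS = 4 are hypothetical choices for illustration; INPUTCNT = 5 is the default from mux_byidx.

```systemverilog
// Stand-alone copy of the helper functions from mux_pkg (package name is hypothetical).
package mux_pkg_check;
    // Smallest i in [1, maxbits) with value < 2**i, else maxbits.
    function automatic int bits_of_integer(int unsigned value, int maxbits);
        for (int i = 1; i < maxbits; i++) begin
            if (value < 2**i) return i;
        end
        return maxbits;
    endfunction

    // Index width needed to address inputcnt mux inputs.
    function automatic int idxbitw_ofmux(int inputcnt);
        if (inputcnt <= 0) return 1;
        return bits_of_integer(inputcnt - 1, 31);
    endfunction

    // Share of totaltaps assigned to stage istage out of stagecnt stages.
    function automatic int delaytaps4stage(int stagecnt, int istage, int totaltaps, bit top_first);
        int judge_0, judge_1;
        if (top_first) begin
            judge_1 = (istage + 0)*totaltaps/stagecnt;
            judge_0 = (istage + 1)*totaltaps/stagecnt;
        end else begin
            judge_1 = (stagecnt - (istage + 1))*totaltaps/stagecnt;
            judge_0 = (stagecnt - (istage + 0))*totaltaps/stagecnt;
        end
        return judge_0 - judge_1;
    endfunction
endpackage

// Simulation-only testbench (hypothetical name) that prints the derived constants.
module tb_mux_pkg_check;
    localparam int INPUTCNT = 5;  // default INPUTCNT of mux_byidx
    localparam int TAPS     = 4;  // assumed DELAYTAPS value, chosen for illustration
    localparam int IDXBITW  = mux_pkg_check::idxbitw_ofmux(INPUTCNT);

    initial begin
        $display("idxbitw = %0d", IDXBITW);  // prints "idxbitw = 3"
        // Same stage order as the STAGE generate loop in mux_byidx.
        for (int i = IDXBITW - 1; i >= 0; i--)
            $display("stage %0d: %0d taps", i,
                     mux_pkg_check::delaytaps4stage(IDXBITW, i, TAPS, 1'b0));
        // prints 1, 1 and 2 taps for stages 2, 1 and 0 (4 taps in total)
        $finish;
    end
endmodule
```

With these values, mux_byidx would build 3 mux stages, and the per-stage taps add up to the requested total, so each constant function is evaluated once per generated stage of every mux_byidx instance.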
07-06-2019 10:18 PM
Hi @mr_john ,
I tried adding the RTL you provided to Vivado, but it throws errors about missing parameter declarations.
Can you provide an archived test case/project (a small test case) through ezmove ftp so I can reproduce it at my end?
After investigating, if it turns out to be a bug, I can file a CR (change request) so that it is fixed in a future version of Vivado.
Let me know if you can share it via ezmove ftp. A test case that demonstrates the issue is necessary to file a CR.
07-08-2019 07:24 AM - edited 07-08-2019 08:10 AM
07-08-2019 07:29 AM
Hi @mr_john ,
If you can provide the existing project, that will also be helpful for checking it at our end.
07-08-2019 08:15 AM
07-09-2019 09:02 AM
@mr_john "the hierachy refreshing will take a long time(almost 1 hour long)."
srcscanner.exe has been known to cause these sorts of problems.
Have you tried any of the source_mgmt_mode workarounds from AR 69846?
07-10-2019 08:31 AM
07-10-2019 08:59 AM - edited 07-11-2019 06:00 AM
@mr_john "It seems that AR 69846 can not solve my problem."
OK. There was another Tcl workaround suggested on the forums that improved the behavior of the srcscanner process:
set_param project.hsv.draftModeDefault only
EDIT: there also appears to be another variant of this command that you could try:
set_param project.hsv.draftModeDefault never
07-10-2019 11:42 AM
A question for Xilinx - just what is this srcscanner application supposed to be doing? Given all the problems it's caused for many years in Vivado, I don't understand why it's still implemented at all. It appears to offer no benefit. We've had it disabled since Vivado 2017.1, and have never re-enabled it. I keep seeing all these (continued) problems on the forums regarding the utility, and I'm glad we have the ability to just disable it. One must wonder, just what benefit is the thing supposed to be giving us?
07-11-2019 09:41 PM
Hi @mr_john ,
I tried to reproduce the long hierarchy refresh runtime by adding or removing an instance at the top level, but I was not able to reproduce the issue you mentioned.
With Vivado 2019.1, I tried Windows 7, Windows 10, and Linux as well. On both Windows versions, the hierarchy refresh took 10-20 seconds and srcscanner used around 1 GB while the refresh was running.
Can you check whether your machine follows the recommendations at this link? https://www.xilinx.com/products/design-tools/vivado/memory.html
Please also provide the environment report of your machine by running report_environment -file <filepath>/env.txt
07-12-2019 05:44 AM
@mr_john "It seems that AR 69846 can not solve my problem."
OK. There was another Tcl workaround suggested on the forums that improved the behavior of the srcscanner process:
set_param project.hsv.draftModeDefault only
EDIT: there also appears to be another variant of this command that you could try:
set_param project.hsv.draftModeDefault never
Thanks very much. I've tried both commands you posted, but the issue remains: same refresh time and same memory usage. So it seems these commands don't help either.
07-13-2019 07:49 AM
It seems the forum ate my reply to you -_-!
For comparison, I changed the source code in the project I attached previously by only removing the constant function calls in module 'mux_byidx'. The refresh time dropped to 3 to 5 seconds, and the maximum memory usage dropped to 44 MB. This experiment suggests that the issue probably lies in instantiating constant function calls without releasing the instances after they are processed. Please look into this more closely.
The modified project is attached below; I hope it helps you figure out the issue.
07-17-2019 01:04 AM
Hi @mr_john ,
As mentioned in my last post, this issue is not reproducible at my end.
It is a machine-specific issue. Please provide the environment report.
With the new project as well, I am unable to reproduce it.
07-17-2019 09:23 AM - edited 07-18-2019 07:31 AM
This is the 3rd time I have lost my message after posting. I spent about 2 hours verifying the project and editing the message, and all the content was lost after posting. I'm very disappointed by this.
I've produced another project to illustrate the issue and attached it at the end of this message. The environment report is also attached, as you requested.
In the project, the top module 'toptest' has two parameters, 'log2of_multimux_routcount' and 'ILLASTRATE_HORRIBLE_MEMORY_USAGE', that control whether the project reproduces the issue. On my computer, setting 'ILLASTRATE_HORRIBLE_MEMORY_USAGE' to 1'b1 reproduces the issue, and setting it to 1'b0 makes it go away. Setting 'log2of_multimux_routcount' to a greater value produces greater memory usage during the hierarchy refresh.
For example, when I set:
parameter int log2of_multimux_routcount = 2, parameter bit ILLASTRATE_HORRIBLE_MEMORY_USAGE = 1'b1
Each srcscanner.exe process reaches a maximum memory usage of about 1.52 GB, and the refresh takes about 100 seconds.
But when I set 'ILLASTRATE_HORRIBLE_MEMORY_USAGE' to 1'b0, each srcscanner.exe uses only about 100 MB of memory, and the refresh time drops to about 3 to 5 seconds.
The only difference between setting 'ILLASTRATE_HORRIBLE_MEMORY_USAGE' to 1'b1 and 1'b0 is whether the wrong or the right scope name is specified for the constant function calls of 'delaytaps4stage' in module 'mux_byidx'.
Please try this project to see whether the issue can be reproduced. You can also set 'log2of_multimux_routcount' to a greater value to see whether the memory usage grows.