crypto: tcrypt - add script tcrypt_speed_compare.py

Message ID	202312101758+0800-wangjinchao@xfusion.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2;
	Date: Sun, 10 Dec 2023 18:19:41 +0800
	From: WangJinchao <wangjinchao@xfusion.com>
	To: Herbert Xu <herbert@gondor.apana.org.au>, "David S. Miller" <davem@davemloft.net>, Steffen Klassert <steffen.klassert@secunet.com>, Daniel Jordan <daniel.m.jordan@oracle.com>, <linux-crypto@vger.kernel.org>, <linux-kernel@vger.kernel.org>
	CC: <stone.xulei@xfusion.com>, <wangjinchao@xfusion.com>
	Subject: [PATCH] crypto: tcrypt - add script tcrypt_speed_compare.py
	Message-ID: <202312101758+0800-wangjinchao@xfusion.com>
	MIME-Version: 1.0
	Content-Type: text/plain; charset="us-ascii"
	Content-Disposition: inline
	Precedence: bulk
Series	crypto: tcrypt - add script tcrypt_speed_compare.py

Commit Message

Wang Jinchao Dec. 10, 2023, 10:19 a.m. UTC

  Create a script for comparing tcrypt speed test logs.
The script will systematically analyze differences item
by item and provide a summary (average).
This tool is useful for evaluating the stability of
cryptographic module algorithms and assisting with
performance optimization.

The script produces comparisons for two kinds of test output:

1. Operations per second
===========================================================================
rfc4106(gcm(aes)) (pcrypt(rfc4106(gcm_base(ctr(aes-generic),ghash-generic))
                         encryption
---------------------------------------------------------------------------
bit key | byte blocks | base ops    | new ops     | differ(%)
160     | 16          | 60276       | 47081       | -21.89
160     | 64          | 55307       | 45430       | -17.86
160     | 256         | 53196       | 41391       | -22.19
160     | 512         | 45629       | 38511       | -15.6
160     | 1024        | 37489       | 44333       | 18.26
160     | 1420        | 32963       | 32815       | -0.45
160     | 4096        | 18416       | 18356       | -0.33
160     | 8192        | 11878       | 10701       | -9.91
224     | 16          | 55332       | 56620       | 2.33
224     | 64          | 59551       | 55006       | -7.63
224     | 256         | 53144       | 49892       | -6.12
224     | 512         | 46655       | 44010       | -5.67
224     | 1024        | 38379       | 35988       | -6.23
224     | 1420        | 33125       | 31529       | -4.82
224     | 4096        | 17750       | 17351       | -2.25
224     | 8192        | 10213       | 10046       | -1.64
288     | 16          | 64662       | 56571       | -12.51
288     | 64          | 57780       | 54815       | -5.13
288     | 256         | 54679       | 50110       | -8.36
288     | 512         | 46895       | 43201       | -7.88
288     | 1024        | 36286       | 35860       | -1.17
288     | 1420        | 31175       | 32327       | 3.7
288     | 4096        | 16686       | 16699       | 0.08
288     | 8192        | 9662        | 9548        | -1.18
---------------------------------------------------------------------------
average differ(%)     | total_differ(%)
---------------------------------------------------------------------------
-5.60                 | 7.28
===========================================================================

2. Average cycles per operation
===========================================================================
rfc4309(ccm(aes)) (rfc4309(ccm_base(ctr(aes-generic),cbcmac(aes-generic))))
                         encryption
---------------------------------------------------------------------------
bit key | byte blocks | base cycles | new cycles  | differ(%)
152     | 16          | 792483      | 801555      | 1.14
152     | 64          | 552470      | 557953      | 0.99
152     | 256         | 254997      | 260518      | 2.17
152     | 512         | 148486      | 153241      | 3.2
152     | 1024        | 80925       | 83446       | 3.12
152     | 1420        | 59601       | 60999       | 2.35
152     | 4096        | 21714       | 22064       | 1.61
152     | 8192        | 10984       | 11301       | 2.89
---------------------------------------------------------------------------
average differ(%)     | total_differ(%)
---------------------------------------------------------------------------
2.18                  | -1.53
===========================================================================
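For readers checking the numbers, here is a small sketch that recomputes the summary of the second table from its rows. The formulas are the ones used in format() in the patch; the variable names are only illustrative. Note that, as written in the patch, total_differ is computed as base minus new, so its sign is opposite to that of the per-row differ(%).

```python
def percent_change(base, new):
    """Per-row differ(%), same formula as in the patch: (new - base) * 100 / base."""
    return round((new - base) * 100 / base, 2)


# (base, new) cycle counts copied from the second table above.
rows = [
    (792483, 801555),
    (552470, 557953),
    (254997, 260518),
    (148486, 153241),
    (80925, 83446),
    (59601, 60999),
    (21714, 22064),
    (10984, 11301),
]

per_row = [percent_change(base, new) for base, new in rows]

# "average differ" is the mean of the per-row percentages.
average_differ = sum(per_row) / len(per_row)

# "total_differ" compares the column sums; the patch uses (base - new) here.
base_sum = sum(base for base, _ in rows)
new_sum = sum(new for _, new in rows)
total_differ = (base_sum - new_sum) * 100 / base_sum

print(f"{average_differ:.2f} | {total_differ:.2f}")  # 2.18 | -1.53
```

The two summary figures answer slightly different questions: the average weights every block size equally, while the total is dominated by the rows with the largest counts.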

Signed-off-by: WangJinchao <wangjinchao@xfusion.com>
---
 MAINTAINERS                                 |   5 +
 tools/crypto/tcrypt/tcrypt_speed_compare.py | 179 ++++++++++++++++++++
 2 files changed, 184 insertions(+)
 create mode 100755 tools/crypto/tcrypt/tcrypt_speed_compare.py

To: Daniel Jordan <daniel.m.jordan@oracle.com>
After spending a considerable amount of time analyzing and 
benchmarking, I've found that although my approach could simplify
the code logic, it leads to a performance decrease. Therefore,
I have decided to abandon the code optimization effort.
During the performance comparison, I utilized the Python script
from this commit and found it to be valuable.
Despite not optimizing padata, I would like to share this tool,
which is helpful for analyzing cryptographic performance, 
for the benefit of others.

Additionally, thank you for your assistance throughout this process.

To: Steffen Klassert <steffen.klassert@secunet.com>
Thank you for providing the testing method. Based on the approach
you suggested, I conducted performance comparisons for padata.
You were correct; the scheduling overhead is significant compared
to 'parallel()' calls. During profiling, approximately 80% of the
time was spent on operations related to 'queue_work_on' and locks.

Furthermore, I observed a substantial number of 'pcrypt(pcrypt(...'
structures during multiple 'modprobe' runs for pcrypt.
To address this, I adjusted the testing procedure by removing the
pcrypt module before each test, as indicated in the comments of this commit.

In summary, I appreciate your guidance. This serves as a conclusion
to my attempt at modifying padata, which I have decided to abandon.

Thank you

Comments

Tim Chen Dec. 12, 2023, 9:56 p.m. UTC | #1

On Sun, 2023-12-10 at 18:19 +0800, WangJinchao wrote:
> Create a script for comparing tcrypt speed test logs.
> The script will systematically analyze differences item
> by item and provide a summary (average).
> This tool is useful for evaluating the stability of
> cryptographic module algorithms and assisting with
> performance optimization.

I have found that for such comparisons, stability depends
on whether we allow the CPU frequency to float or pin it.
In the past, when I used tcrypt, I sometimes had to pin
the CPU frequency to get stable results.

One suggestion I have is for you to also dump the
frequency governor and P-state info, so we know
whether the runs being compared were made
at the same CPU frequency.

Tim 
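A minimal sketch of what such a dump could look like, assuming the standard cpufreq sysfs attributes (scaling_governor, scaling_cur_freq, scaling_min_freq, scaling_max_freq) are exposed under /sys/devices/system/cpu/cpuN/cpufreq/. The helper name read_cpufreq_state is hypothetical and is not part of the patch. On systems that do not expose cpufreq, it simply returns an empty dict.

```python
import glob
import os


def read_cpufreq_state(root="/sys/devices/system/cpu"):
    """Return {cpu_name: {attribute: value}} for the cpufreq files found under root."""
    attrs = (
        "scaling_governor",
        "scaling_cur_freq",
        "scaling_min_freq",
        "scaling_max_freq",
    )
    state = {}
    # cpu[0-9]* matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for cpu_dir in sorted(glob.glob(os.path.join(root, "cpu[0-9]*"))):
        values = {}
        for attr in attrs:
            path = os.path.join(cpu_dir, "cpufreq", attr)
            try:
                with open(path) as f:
                    values[attr] = f.read().strip()
            except OSError:
                # Attribute missing or unreadable: skip it.
                continue
        if values:
            state[os.path.basename(cpu_dir)] = values
    return state
```

Calling read_cpufreq_state() right before and right after each tcrypt run, and saving the result next to the dmesg log, would let the two runs be checked for matching governor and frequency limits.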


Wang Jinchao Dec. 13, 2023, 3:01 a.m. UTC | #2

On Tue, Dec 12, 2023 at 01:56:56PM -0800, Tim Chen wrote:
> On Sun, 2023-12-10 at 18:19 +0800, WangJinchao wrote:
> > Create a script for comparing tcrypt speed test logs.
> > The script will systematically analyze differences item
> > by item and provide a summary (average).
> > This tool is useful for evaluating the stability of
> > cryptographic module algorithms and assisting with
> > performance optimization.
> 
> I have found that for such comparisons, stability depends
> on whether we allow the CPU frequency to float or pin it.
> In the past, when I used tcrypt, I sometimes had to pin
> the CPU frequency to get stable results.
> 
> One suggestion I have is for you to also dump the
> frequency governor and P-state info, so we know
> whether the runs being compared were made
> at the same CPU frequency.
> 
> Tim 
> 
Thank you for the feedback. This information is valuable for stability testing
and performance optimization.

However, I am not sure how to dump P-state information, and I believe that
the script itself cannot do so, for the following reasons:

1. The primary purpose of this script is to compare tcrypt logs, and it is
executed after the completion of the tcrypt tests. Consequently, it cannot
dump P-state information during tcrypt's runtime.

2. In virtualized environments, there is no available information in the
`/sys/devices/system/cpu/cpufreq` directory pertaining to P-state details.

Am I correct in my understanding?
I am considering documenting your suggestion in the script's comments.
What are your thoughts?
>

Tim Chen Dec. 13, 2023, 11:03 p.m. UTC | #3

On Wed, 2023-12-13 at 11:01 +0800, Wang Jinchao wrote:
> On Tue, Dec 12, 2023 at 01:56:56PM -0800, Tim Chen wrote:
> > On Sun, 2023-12-10 at 18:19 +0800, WangJinchao wrote:
> > > Create a script for comparing tcrypt speed test logs.
> > > The script will systematically analyze differences item
> > > by item and provide a summary (average).
> > > This tool is useful for evaluating the stability of
> > > cryptographic module algorithms and assisting with
> > > performance optimization.
> > 
> > I have found that for such comparisons, stability depends
> > on whether we allow the CPU frequency to float or pin it.
> > In the past, when I used tcrypt, I sometimes had to pin
> > the CPU frequency to get stable results.
> > 
> > One suggestion I have is for you to also dump the
> > frequency governor and P-state info, so we know
> > whether the runs being compared were made
> > at the same CPU frequency.
> > 
> > Tim 
> > 
> Thank you for the feedback. This information is valuable for stability testing
> and performance optimization.
> 
> However, I am not sure how to dump P-state information, and I believe that
> the script itself cannot do so, for the following reasons:
> 
> 1. The primary purpose of this script is to compare tcrypt logs, and it is
> executed after the completion of the tcrypt tests. Consequently, it cannot
> dump P-state information during tcrypt's runtime.
> 
> 2. In virtualized environments, there is no available information in the
> `/sys/devices/system/cpu/cpufreq` directory pertaining to P-state details.
> 
> Am I correct in my understanding?
> I am considering documenting your suggestion in the script's comments.
> What are your thoughts?
> > 
I think that will be fine. 

Thanks.

Tim

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index a0fb0df07b43..5690ab99f107 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5535,6 +5535,11 @@  F:	include/crypto/
 F:	include/linux/crypto*
 F:	lib/crypto/
 
+CRYPTO SPEED TEST COMPARE
+M:	WangJinchao <wangjinchao@xfusion.com>
+L:	linux-crypto@vger.kernel.org
+F:	tools/crypto/tcrypt/tcrypt_speed_compare.py
+
 CRYPTOGRAPHIC RANDOM NUMBER GENERATOR
 M:	Neil Horman <nhorman@tuxdriver.com>
 L:	linux-crypto@vger.kernel.org
diff --git a/tools/crypto/tcrypt/tcrypt_speed_compare.py b/tools/crypto/tcrypt/tcrypt_speed_compare.py
new file mode 100755
index 000000000000..789d24013d8e
--- /dev/null
+++ b/tools/crypto/tcrypt/tcrypt_speed_compare.py
@@ -0,0 +1,179 @@ 
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) xFusion Digital Technologies Co., Ltd., 2023
+#
+# Author: WangJinchao <wangjinchao@xfusion.com>
+#
+"""
+A tool for comparing tcrypt speed test logs.
+
+It supports both operations-per-second and cycles-per-operation tests.
+For example, use it in the bash script below:
+
+```bash
+#!/bin/bash
+
+seq_num=1
+sec=1
+num_mb=8
+mode=211
+
+# base speed test
+lsmod | grep pcrypt && modprobe -r pcrypt
+dmesg -C
+modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3
+modprobe tcrypt mode=${mode} sec=${sec} num_mb=${num_mb}
+dmesg > ${seq_num}_base_dmesg.log
+
+# new speed test
+lsmod | grep pcrypt && modprobe -r pcrypt
+dmesg -C
+modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3
+modprobe tcrypt mode=${mode} sec=${sec} num_mb=${num_mb}
+dmesg > ${seq_num}_new_dmesg.log
+lsmod | grep pcrypt && modprobe -r pcrypt
+
+./tcrypt_speed_compare.py ${seq_num}_base_dmesg.log ${seq_num}_new_dmesg.log  >${seq_num}_compare.log
+grep 'average' -A2 -B0 --group-separator="" ${seq_num}_compare.log
+```
+"""
+
+import sys
+import re
+
+
+def parse_title(line):
+    pattern = r'tcrypt: testing speed of (.*?) (encryption|decryption)'
+    match = re.search(pattern, line)
+    if match:
+        alg = match.group(1)
+        op = match.group(2)
+        return alg, op
+    else:
+        return "", ""
+
+
+def parse_item(line):
+    pattern_operations = r'\((\d+) bit key, (\d+) byte blocks\): (\d+) operations'
+    pattern_cycles = r'\((\d+) bit key, (\d+) byte blocks\): 1 operation in (\d+) cycles'
+    match = re.search(pattern_operations, line)
+    if match:
+        res = {
+            "bit_key": int(match.group(1)),
+            "byte_blocks": int(match.group(2)),
+            "operations": int(match.group(3)),
+        }
+        return res
+
+    match = re.search(pattern_cycles, line)
+    if match:
+        res = {
+            "bit_key": int(match.group(1)),
+            "byte_blocks": int(match.group(2)),
+            "cycles": int(match.group(3)),
+        }
+        return res
+
+    return None
+
+
+def parse(filepath):
+    result = {}
+    alg, op = "", ""
+    with open(filepath, 'r') as file:
+        for line in file:
+            if not line.strip():
+                continue
+            _alg, _op = parse_title(line)
+            if _alg:
+                alg, op = _alg, _op
+                if alg not in result:
+                    result[alg] = {}
+                if op not in result[alg]:
+                    result[alg][op] = []
+                continue
+            parsed_result = parse_item(line)
+            if parsed_result and alg:
+                result[alg][op].append(parsed_result)
+    return result
+
+
+def merge(base, new):
+    merged = {}
+    for alg in base.keys():
+        merged[alg] = {}
+        for op in base[alg].keys():
+            if op not in merged[alg]:
+                merged[alg][op] = []
+            for index in range(len(base[alg][op])):
+                merged_item = {
+                    "bit_key": base[alg][op][index]["bit_key"],
+                    "byte_blocks": base[alg][op][index]["byte_blocks"],
+                }
+                if "operations" in base[alg][op][index].keys():
+                    merged_item["base_ops"] = base[alg][op][index]["operations"]
+                    merged_item["new_ops"] = new[alg][op][index]["operations"]
+                else:
+                    merged_item["base_cycles"] = base[alg][op][index]["cycles"]
+                    merged_item["new_cycles"] = new[alg][op][index]["cycles"]
+
+                merged[alg][op].append(merged_item)
+    return merged
+
+
+def format(merged):
+    for alg in merged.keys():
+        for op in merged[alg].keys():
+            base_sum = 0
+            new_sum = 0
+            differ_sum = 0
+            differ_cnt = 0
+            print()
+            hlen = 80
+            print("="*hlen)
+            print(f"{alg}")
+            print(f"{' '*(len(alg)//3) + op}")
+            print("-"*hlen)
+            key = ""
+            if "base_ops" in merged[alg][op][0]:
+                key = "ops"
+                print(f"bit key | byte blocks | base ops    | new ops     | differ(%)")
+            else:
+                key = "cycles"
+                print(f"bit key | byte blocks | base cycles | new cycles  | differ(%)")
+            for index in range(len(merged[alg][op])):
+                item = merged[alg][op][index]
+                base_cnt = item[f"base_{key}"]
+                new_cnt = item[f"new_{key}"]
+                base_sum += base_cnt
+                new_sum += new_cnt
+                differ = round((new_cnt - base_cnt)*100/base_cnt, 2)
+                differ_sum += differ
+                differ_cnt += 1
+                bit_key = item["bit_key"]
+                byte_blocks = item["byte_blocks"]
+                print(
+                    f"{bit_key:<7} | {byte_blocks:<11} | {base_cnt:<11} | {new_cnt:<11} | {differ:<8}")
+            average_speed_up = "{:.2f}".format(differ_sum/differ_cnt)
+            ops_total_speed_up = "{:.2f}".format(
+                (base_sum - new_sum) * 100 / base_sum)
+            print('-'*hlen)
+            print("average differ(%)     | total_differ(%)")
+            print('-'*hlen)
+            print(f"{average_speed_up:<21} | {ops_total_speed_up:<10}")
+            print('='*hlen)
+
+
+def main(base_log, new_log):
+    base = parse(base_log)
+    new = parse(new_log)
+    merged = merge(base, new)
+    format(merged)
+
+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        print(f"usage: {sys.argv[0]} base_log new_log")
+        sys.exit(1)
+    main(sys.argv[1], sys.argv[2])
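
One observation on merge(): it pairs base and new entries by list index and assumes both logs contain the same algorithms, operations, and number of items. The sketch below shows one possible alternative that pairs entries by (algorithm, operation, key size, block size) and reports anything present in only one log. The regexes are copied from the patch; the helper names and the sample log lines are hypothetical (the driver name is made up, the counts come from the first table, and trailing fields of the real tcrypt lines are omitted).

```python
import re

# Patterns copied from parse_title() and parse_item() in the patch.
TITLE = re.compile(r'tcrypt: testing speed of (.*?) (encryption|decryption)')
OPS = re.compile(r'\((\d+) bit key, (\d+) byte blocks\): (\d+) operations')


def parse_ops(lines):
    """Map (alg, op, bit_key, byte_blocks) -> operations count."""
    result = {}
    alg = op = None
    for line in lines:
        title = TITLE.search(line)
        if title:
            alg, op = title.groups()
            continue
        item = OPS.search(line)
        if item and alg is not None:  # ignore items seen before any title
            bit_key, byte_blocks, ops = map(int, item.groups())
            result[(alg, op, bit_key, byte_blocks)] = ops
    return result


def pair_by_key(base, new):
    """Pair entries present in both logs; also return keys found in only one."""
    common = sorted(base.keys() & new.keys())
    unmatched = sorted(base.keys() ^ new.keys())
    return [(key, base[key], new[key]) for key in common], unmatched


# Illustrative input only, not captured from a real run.
base_lines = [
    "tcrypt: testing speed of rfc4106(gcm(aes)) (example-driver) encryption",
    "tcrypt: test 0 (160 bit key, 16 byte blocks): 60276 operations in 1 seconds",
    "tcrypt: test 1 (160 bit key, 64 byte blocks): 55307 operations in 1 seconds",
]
new_lines = [
    "tcrypt: testing speed of rfc4106(gcm(aes)) (example-driver) encryption",
    "tcrypt: test 0 (160 bit key, 16 byte blocks): 47081 operations in 1 seconds",
]

pairs, unmatched = pair_by_key(parse_ops(base_lines), parse_ops(new_lines))
for key, base_ops, new_ops in pairs:
    print(key, base_ops, new_ops)
print("only in one log:", unmatched)
```

With index-based pairing, the truncated new log above would raise an IndexError on the second item; keyed pairing keeps the comparison going and makes the mismatch visible.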