C Example: Getting the MD5 Value of a File

Author: DS Xiaolongge

1. MD5 introduction

Message Digest Algorithm 5 (MD5) is a widely used hash function. It takes data of any length as input and produces a fixed-length 128-bit hash, known as the MD5 value, which is usually written as 32 hexadecimal digits. MD5 is fast and supported almost everywhere, but it is no longer collision-resistant, so it is best suited to detecting accidental changes rather than deliberate ones.

The MD5 algorithm has the following characteristics:

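(1) Irreversibility: The original data cannot be recovered from a given MD5 value by an inverse computation.

(2) Near-uniqueness: Different inputs almost always produce different MD5 values. Because the output is only 128 bits, collisions do exist and can be constructed deliberately, but they are extremely unlikely to occur by accident.

(3) Efficiency: The MD5 value of a given piece of data can be computed very quickly.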

Common uses of MD5 values include:

(1) Data integrity verification: The MD5 value can be used to check that a file was not altered during transmission. The sender calculates the MD5 value of the file and sends it along with the file. The receiver recalculates the MD5 value of the received file and compares it with the sender's value. If the two match, the file arrived unchanged.

(2) Password storage: Many systems avoid storing user passwords in plaintext and store a hash of the password instead. When a user logs in, the system hashes the entered password and compares the result with the stored value. MD5 has historically been used this way, although a fast, unsalted hash is now regarded as too weak for this purpose.

(3) Security authentication: MD5 values have also been used in authentication mechanisms such as digital certificates to check the integrity of documents and the authenticity of authentication information.

(4) Data fingerprinting: The MD5 value can serve as a compact identifier for a piece of data, making it quick to compare items and find duplicates.
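The comparison step in the integrity check of scenario (1) can be sketched in C as follows. This is only a minimal illustration and not part of the sample code in this article: the helper names md5_to_hex and md5_hex_matches are hypothetical, and the digest is assumed to be the 16-byte value produced by whichever MD5 routine is in use.

```c
#include <ctype.h>
#include <stddef.h>

#define MD5_BYTES 16  /* an MD5 digest is 128 bits = 16 bytes */

/* Hypothetical helper: format a 16-byte digest as 32 lowercase hex
 * characters plus a terminating null. out must hold at least 33 bytes. */
void md5_to_hex(const unsigned char digest[MD5_BYTES], char out[2 * MD5_BYTES + 1]) {
    static const char hex[] = "0123456789abcdef";
    for (size_t i = 0; i < MD5_BYTES; i++) {
        out[2 * i]     = hex[digest[i] >> 4];
        out[2 * i + 1] = hex[digest[i] & 0x0f];
    }
    out[2 * MD5_BYTES] = '\0';
}

/* Hypothetical helper: return 1 if a locally computed digest matches the
 * hex string published by the sender (case-insensitive), 0 otherwise. */
int md5_hex_matches(const unsigned char digest[MD5_BYTES], const char* expected_hex) {
    char actual[2 * MD5_BYTES + 1];
    md5_to_hex(digest, actual);
    for (size_t i = 0; i < 2 * MD5_BYTES; i++) {
        if (expected_hex[i] == '\0' ||
            tolower((unsigned char)expected_hex[i]) != actual[i]) {
            return 0;  /* too short, or a character differs */
        }
    }
    return expected_hex[2 * MD5_BYTES] == '\0';  /* reject overly long strings */
}
```

The receiver would compute the digest of the downloaded file, then call md5_hex_matches with the hex string the sender published. Accepting upper- and lowercase hex is a design choice made here because different tools print digests in different cases.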

2. Sample code

2.1 Obtaining MD5 Values (OpenSSL Library)

Computing the MD5 value of a piece of data in C can be done with an existing third-party library. The following example uses the OpenSSL library:

(1) Install the OpenSSL library (if it is not already installed) and include the relevant header files:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/md5.h>           

(2) Write a helper function that calculates the MD5 value of the data:

void calculate_md5(const unsigned char* data, size_t length, unsigned char* md5_hash) {
    MD5_CTX ctx;
    MD5_Init(&ctx);
    MD5_Update(&ctx, data, length);
    MD5_Final(md5_hash, &ctx);
}           

The function accepts three parameters: data points to the data to be hashed, length is the number of bytes to hash, and md5_hash is an output buffer of at least MD5_DIGEST_LENGTH (16) bytes that receives the MD5 value.

Here's a complete program that calls the helper function above and prints the MD5 value:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/md5.h>

void calculate_md5(const unsigned char* data, size_t length, unsigned char* md5_hash) {
    MD5_CTX ctx;
    MD5_Init(&ctx);
    MD5_Update(&ctx, data, length);
    MD5_Final(md5_hash, &ctx);
}

void print_md5(const unsigned char* md5_hash) {
    for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
        printf("%02x", md5_hash[i]);
    }
    printf("\n");
}

int main() {
    const unsigned char data[] = "Hello, World!";
    size_t length = sizeof(data) - 1; // exclude the string's terminating null character
    unsigned char md5_hash[MD5_DIGEST_LENGTH];

    calculate_md5(data, length, md5_hash);
    printf("MD5: ");
    print_md5(md5_hash);

    return 0;
}           

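This sample program prints the MD5 value of a piece of data. To hash something else, put it in the data array; the length is derived from the array size, so adjust that calculation if the data is not a null-terminated string literal.

The MD5 functions used here come from OpenSSL's libcrypto, so the program must be linked against it at compile time. On Linux, pass -lcrypto to the linker (adding -lssl as well is harmless but not required for MD5). Note that MD5_Init, MD5_Update, and MD5_Final are marked as deprecated since OpenSSL 3.0 in favor of the EVP interface, so newer toolchains emit deprecation warnings for this code. On Windows, download and install the OpenSSL library, then configure the correct library path and library file names in your build settings.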

2.2 Obtaining the MD5 Value of a File (OpenSSL Library)

The following example uses the OpenSSL library to calculate the MD5 value of a file:

(1) Install the OpenSSL library (if it is not already installed) and include the relevant header files:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/md5.h>           

(2) Write a helper function that calculates the MD5 value of the file:

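void calculate_file_md5(const char* filename, unsigned char* md5_hash) {
    // Open in binary mode so the bytes are hashed exactly as stored on disk
    FILE* file = fopen(filename, "rb");
    if (file == NULL) {
        fprintf(stderr, "Failed to open file: %s\n", filename);
        return;  // md5_hash is left untouched on failure
    }

    MD5_CTX ctx;
    MD5_Init(&ctx);

    // Feed the file to MD5 in 1024-byte chunks so large files need little memory
    unsigned char buffer[1024];
    size_t bytes_read;
    while ((bytes_read = fread(buffer, 1, sizeof(buffer), file)) != 0) {
        MD5_Update(&ctx, buffer, bytes_read);
    }

    fclose(file);

    MD5_Final(md5_hash, &ctx);
}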

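The function accepts two parameters: filename is the path of the file to hash, and md5_hash is an output buffer of at least MD5_DIGEST_LENGTH (16) bytes that receives the MD5 value. If the file cannot be opened, the function prints an error and returns without writing to md5_hash, so a caller that needs to detect failure should initialize the buffer first or extend the function to return a status code.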
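The read loop is what lets this approach handle files of any size: only one 1024-byte buffer is in memory at a time, and each chunk is handed to an update step. The sketch below isolates that pattern without OpenSSL so it can be compiled on its own. The names chunk_fn, for_each_chunk, and count_bytes are hypothetical, and the byte counter merely stands in for MD5_Update.

```c
#include <stdio.h>

/* Hypothetical callback type: receives one chunk of file data. */
typedef void (*chunk_fn)(void* ctx, const unsigned char* data, size_t len);

/* Read `filename` in fixed-size chunks and pass each chunk to `fn`.
 * Returns 0 on success, -1 if the file cannot be opened or a read fails. */
int for_each_chunk(const char* filename, chunk_fn fn, void* ctx) {
    FILE* file = fopen(filename, "rb");
    if (file == NULL) {
        return -1;
    }

    unsigned char buffer[1024];
    size_t bytes_read;
    while ((bytes_read = fread(buffer, 1, sizeof(buffer), file)) != 0) {
        fn(ctx, buffer, bytes_read);  /* a hash update call would go here */
    }

    int failed = ferror(file);  /* distinguish a read error from end of file */
    fclose(file);
    return failed ? -1 : 0;
}

/* Stand-in for a hash update: just counts the bytes seen. */
void count_bytes(void* ctx, const unsigned char* data, size_t len) {
    (void)data;
    *(size_t*)ctx += len;
}
```

Returning a status code, unlike the void helper above, lets the caller tell a missing file apart from a successfully hashed one. That is one possible design, not the only one.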

Here's a complete example that calls the helper function above and prints the MD5 value of the file:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/md5.h>

void calculate_file_md5(const char* filename, unsigned char* md5_hash) {
    // ... function body as shown above ...
}

void print_md5(const unsigned char* md5_hash) {
    for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
        printf("%02x", md5_hash[i]);
    }
    printf("\n");
}

int main() {
    const char* filename = "path/to/file";
    unsigned char md5_hash[MD5_DIGEST_LENGTH];

    calculate_file_md5(filename, md5_hash);
    printf("MD5: ");
    print_md5(md5_hash);

    return 0;
}           

This example program opens the specified file and calculates its MD5 value. Replace the placeholder in the filename string with the path of the file you want to hash.

As in section 2.1, the MD5 functions come from OpenSSL's libcrypto, so the program must be linked against it at compile time. On Linux, pass -lcrypto to the linker (adding -lssl as well is harmless but not required for MD5). On Windows, download and install the OpenSSL library, then configure the correct library path and library file names in your build settings.

2.3 Implementing the MD5 Algorithm by Hand

Implementing the MD5 algorithm is fairly involved: it relies on bitwise logic operations, bit shifts and rotations, and modular 32-bit addition.
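To make those operations concrete, here is a small sketch of the four bitwise mixing functions and the 32-bit left rotation that MD5 uses in its four rounds, following the definitions in RFC 1321. The function names rotl32, md5_f, md5_g, md5_h, and md5_i are chosen here purely for illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* Round 1: bitwise select. For each bit, take y where x is 1, else z. */
uint32_t md5_f(uint32_t x, uint32_t y, uint32_t z) { return (x & y) | (~x & z); }

/* Round 2: the same selector, but controlled by z (x where z is 1, else y). */
uint32_t md5_g(uint32_t x, uint32_t y, uint32_t z) { return (x & z) | (y & ~z); }

/* Round 3: bitwise XOR of the three inputs. */
uint32_t md5_h(uint32_t x, uint32_t y, uint32_t z) { return x ^ y ^ z; }

/* Round 4: y XOR (x OR NOT z). */
uint32_t md5_i(uint32_t x, uint32_t y, uint32_t z) { return y ^ (x | ~z); }
```

Each of the 64 steps adds one of these function results to a message word and a constant, rotates the sum left, and adds it to the previous state word.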

The following is a simplified MD5 implementation in pure C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char uint8;
typedef unsigned int uint32;

// MD5 per-step additive constants: entry i is floor(2^32 * |sin(i + 1)|)
const uint32 MD5_CONSTANTS[] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
    0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
    0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
    0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
    0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
    0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
    0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
    0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
    0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
};

// Rotate a 32-bit value left by n bits (0 < n < 32)
#define LEFT_ROTATE(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

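// Store a 32-bit value as 4 bytes, least significant byte first.
// Note: despite its name, this writes little-endian order, which is what MD5 requires.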
void to_big_endian(uint32 value, uint8* buffer) {
    buffer[0] = (uint8)(value & 0xff);
    buffer[1] = (uint8)((value >> 8) & 0xff);
    buffer[2] = (uint8)((value >> 16) & 0xff);
    buffer[3] = (uint8)((value >> 24) & 0xff);
}

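// Process one 64-byte message block
void process_block(const uint8* block, uint32* state) {
    // Per-step left-rotation amounts defined by the MD5 specification (RFC 1321)
    static const uint32 SHIFTS[64] = {
        7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
        5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
        4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
        6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
    };
    uint32 a = state[0];
    uint32 b = state[1];
    uint32 c = state[2];
    uint32 d = state[3];
    uint32 m[16];

    // Split the block into sixteen 32-bit words, decoding each as little-endian
    for (int i = 0; i < 16; i++) {
        m[i] = (((uint32)block[i * 4 + 0]) << 0) |
               (((uint32)block[i * 4 + 1]) << 8) |
               (((uint32)block[i * 4 + 2]) << 16) |
               (((uint32)block[i * 4 + 3]) << 24);
    }

    // Main loop: 64 steps in four rounds of 16
    for (int i = 0; i < 64; i++) {
        uint32 f, g;  // f: round function result, g: index of the message word

        if (i < 16) {
            f = (b & c) | ((~b) & d);
            g = i;
        } else if (i < 32) {
            f = (d & b) | ((~d) & c);
            g = (5 * i + 1) % 16;
        } else if (i < 48) {
            f = b ^ c ^ d;
            g = (3 * i + 5) % 16;
        } else {
            f = c ^ (b | (~d));
            g = (7 * i) % 16;
        }

        uint32 temp = d;
        d = c;
        c = b;
        // The rotation amount varies per step; using a fixed value here
        // would produce digests that do not match standard MD5
        b = b + LEFT_ROTATE((a + f + MD5_CONSTANTS[i] + m[g]), SHIFTS[i]);
        a = temp;
    }

    // Add this block's result to the running state
    state[0] += a;
    state[1] += b;
    state[2] += c;
    state[3] += d;
}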

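// Compute the MD5 digest of a message held entirely in memory
void calculate_md5(const uint8* message, size_t length, uint8* digest) {
    // Initial state values defined by the MD5 specification
    uint32 state[4] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };

    // Pad the message to a multiple of 64 bytes, leaving room for the
    // 0x80 marker byte and the 8-byte length field
    size_t padded_length = ((length + 8) / 64 + 1) * 64;
    uint8* padded_message = (uint8*)calloc(padded_length, 1);
    if (padded_message == NULL) {
        fprintf(stderr, "Out of memory\n");
        return;
    }
    memcpy(padded_message, message, length);
    padded_message[length] = 0x80;  // a single 1 bit followed by zero bits
    // Append the message length in bits as a 64-bit little-endian value
    to_big_endian((uint32)(length << 3), padded_message + padded_length - 8);
    to_big_endian((uint32)(length >> 29), padded_message + padded_length - 4);

    // Process the message one 64-byte block at a time
    for (size_t i = 0; i < padded_length; i += 64) {
        process_block(padded_message + i, state);
    }

    // Serialize the four state words as the 16-byte digest
    for (int i = 0; i < 4; i++) {
        to_big_endian(state[i], digest + i * 4);
    }

    free(padded_message);
}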

// Print the digest as 32 hexadecimal digits
void print_md5(const uint8* digest) {
    for (int i = 0; i < 16; i++) {
        printf("%02x", digest[i]);
    }
    printf("\n");
}

int main() {
    const char* message = "Hello, World!";
    size_t length = strlen(message);
    uint8 digest[16];

    calculate_md5((const uint8*)message, length, digest);
    printf("MD5: ");
    print_md5(digest);

    return 0;
}           

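This program calculates the MD5 value of a given string. To hash different data, change the message string; the length is taken with strlen, so binary data containing null bytes needs its length passed explicitly. Because the function copies the whole message into a single padded buffer, it is suited to small inputs rather than large files. Checking the output against a known tool such as md5sum is a quick way to validate any hand-written implementation.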
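Padding is one of the easier parts of the algorithm to get wrong, so it can help to look at it in isolation. The sketch below mirrors the padded-length formula used in calculate_md5 and builds the padded buffer. The names md5_padded_length and md5_pad are hypothetical, and the 64-bit length field is written least significant byte first, as MD5 requires.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Padded size: message + 1 marker byte + 8 length bytes, rounded up to 64. */
size_t md5_padded_length(size_t length) {
    return ((length + 8) / 64 + 1) * 64;
}

/* Returns a newly allocated padded copy of the message, or NULL on
 * allocation failure. The caller must free() the result. */
unsigned char* md5_pad(const unsigned char* message, size_t length, size_t* padded_length) {
    size_t total = md5_padded_length(length);
    /* calloc zero-fills, so the padding bytes are already 0 */
    unsigned char* out = (unsigned char*)calloc(total, 1);
    if (out == NULL) {
        return NULL;
    }
    if (length > 0) {
        memcpy(out, message, length);
    }
    out[length] = 0x80;  /* a single 1 bit followed by zero bits */

    /* Message length in bits, stored as 8 little-endian bytes at the end */
    uint64_t bit_length = (uint64_t)length * 8;
    for (int i = 0; i < 8; i++) {
        out[total - 8 + i] = (unsigned char)(bit_length >> (8 * i));
    }
    *padded_length = total;
    return out;
}
```

With this formula, a 55-byte message still fits in one 64-byte block, while a 56-byte message needs two, because the marker byte and the length field no longer fit in the first block.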