Bash – Write the same file multiple times to one file using dd

bash dd shell-script

I am trying to create large dummy files on a drive using dd. I am currently doing this:

#!/bin/bash
writeFile(){ # $1 - destination directory/filename, $2 - source filepath, $3 - blocksize, $4 - blockcount, $5 - log file name

if [ "$#" -ne 5 ]; then
    echo "Bad number of args - Should be 5, not $#"
    return 1;
fi

dest_filepath=$1
src_filepath=$2
block_size=$3
block_count=$4
log_file=$5

int_regex='^[0-9]+$' 

file_size=$(($block_size * $block_count))
src_file_size=`ls -l "$src_filepath" | awk '{print $5}'`
full_iter=0
while [[ $file_size -ge $src_file_size ]]; do
    file_size=$((file_size - $src_file_size))
    full_iter=$((full_iter + 1))
done

section_block_count=$(($src_file_size / $block_size))
echo $section_block_count $block_size
topping_off_block_count=$(($file_size / $block_size))

dest_dir=$(dirname $dest_filepath)
if [ -d "$dest_dir" ] && [ -r $src_filepath ] && [[ $block_size =~ $int_regex ]] && [[ $block_count =~ $int_regex ]]; then
    data_written=0
    for (( i=0 ; i < $full_iter ; i=$((i+1)) )); do
        (time dd of=$dest_filepath if=$src_filepath bs=$block_size count=$section_block_count seek=$data_written) >> $log_file 2>&1 #Output going to external file
        data_written=$(($data_written + $src_file_size +1 ))
        echo $data_written
    done

    if [[ $file_size -gt 0 ]]; then
        (time dd of=$dest_filepath if=$src_filepath bs=$block_size count=$topping_off_block_count seek=$data_written) >> $log_file 2>&1 & #Output going to external file
    fi
    return 0;
fi

return 1;   
}

However, this isn't working: either it only writes from src_filepath once, or it writes over the same part of the file multiple times; I can't tell which. In this particular case, I'm writing from a 256MB file 4 times to create a single 1GB file, but I want to keep it generic so that I can write any size from and to any size.

The aim is to fragment a hard drive, and measure the output of dd (rate of transfer specifically) and the time it took.

I am on an embedded system with limited functionality, and the OS is a very cut-down version of Linux using BusyBox.

How do I alter this so that it will write the correct size file?

Best Answer

Replying to comments: conv=notrunc makes dd not truncate, but doesn't make it seek to the end. (It leaves out O_TRUNC, but doesn't add O_APPEND, in the open(2) system call.)

Answering the question: If you insist on using dd instead of cat, then get the shell to open the output file for append, and have dd write to its stdout.

dd if=src bs=128k count=$count of=/dev/stdout >> dest 2>> log
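Wrapped in a loop, that looks something like the following minimal sketch. The file names, sizes, and repeat count are placeholders, not from the question; note that when of= is omitted, dd writes to its stdout, so the shell's >> redirection keeps the destination open in append mode:

```shell
# Sketch: the shell opens dest for append; dd streams to stdout
# (dd defaults to stdout when of= is omitted).  Placeholders below.
src=src.bin
dest=dest.bin
log=dd.log
repeats=4

dd if=/dev/zero of="$src" bs=1024 count=256 2>/dev/null  # demo source file
: > "$dest"                                              # empty destination
for i in $(seq "$repeats"); do
    # time(1) and dd's transfer stats both go to stderr -> log
    (time dd if="$src" bs=128k) >> "$dest" 2>> "$log"
done
```

Because the file descriptor is opened with O_APPEND by the shell, each dd invocation lands at the current end of the file, so no seek= arithmetic is needed.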

Also, if you're trying to fragment your drive, you could do a bunch of fallocate(1) allocations to use space, and then start using dd once the drive is near full. util-linux's fallocate program is a simple front-end to the fallocate(2) system call.
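A bounded sketch of that fallocate idea, allocating a handful of fixed-size files (the directory name, file count, and size are placeholders; in practice you would loop until fallocate fails with ENOSPC):

```shell
# Sketch: pre-allocate space with fallocate(1) from util-linux.
# Directory, count, and size are placeholders, not from the answer.
dir=fill.d
mkdir -p "$dir"
for i in 0 1 2 3; do
    fallocate -l 1M "$dir/fill.$i" || break   # stop once the drive is full
done
# drive is now (near) full; switch to dd appends from here
```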

XFS, for example, will detect the open/append pattern and leave its speculatively preallocated space beyond EOF allocated for a few seconds after the file is closed. So on XFS, a loop of appending to the same file repeatedly won't produce as much fragmentation as writing many small files.

You're on an embedded system, so I assume you're not using XFS. In that case, you might still see less fragmentation from your close/reopen/write-more pattern than you'd expect, with a decently smart filesystem. Maybe sync between each write, to wait for the FS to allocate and write out all your data before letting it know there's more coming.
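A minimal sketch of that sync-between-writes idea (file names, sizes, and iteration count are placeholders):

```shell
# Sketch: sync(1) after each append so the filesystem commits each
# chunk before the next one arrives.  All names/sizes are placeholders.
src=chunk.bin
dest=big.bin
dd if=/dev/zero of="$src" bs=1024 count=64 2>/dev/null   # demo source
: > "$dest"
for i in 1 2 3; do
    dd if="$src" bs=64k >> "$dest" 2>/dev/null
    sync    # force allocation/writeback before appending more
done
```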
