YASE Encoder

Fork me on Github

You can find a complete version of the project that is described in this paper on my Github account.


YASE (Yet Another Sub Encoder) is a tiny but efficient Sub Encoder to quickly encode your shellcode payloads and thus bypass some common restricted character while exploiting buffer overflows.

This tool will brute force any possible combination to achieve expected result. The advantage of a such approach is to have an output result that always look different.

Optimization N°1

EAX register will be set to zero only at the beginning (using the two and operations)

and operation to initialize a register is very costy (10 bytes)

After the first initializaion, EAX value is saved to EBX. During next sub decoding operations EAX is restored from EBX thus saving a huge amount of space (2 bytes instead of 10).

Optimization N°2 - TODO

  • Optimize sub encoding. (Minimize sub operations when possible)

Support Multiple Output Formats

  • Commented Python Variable
  • Python Variable
  • C Variable
  • Hex Formated String
  • Hex String
  • Raw




    Jean-Pierre LESUEUR (@DarkCoderSc)

    YASE Encoder : Yet Another Sub-Encoder

    License: MIT

    Category: OSCE Preparation  

        I'm using bruteforcing in order generate the subs with valid characters.

        Valid characters are shuffled in order to output different looking subs each time you use the script
        even for exact same instructions.

        Be sure to understand what sub encoding is before using any automated scripts otherwise you will miss 
        something really cool to learn :)

        - Optimize required amount of sub to limit size of output payload (actually always three subs)

from textwrap import wrap
import numpy as np
import argparse
import random
import struct
import sys

_asm_push_eax = "\\x50"
_asm_push_ebx = "\\x53"
_asm_pop_eax = "\\x58"
_asm_pop_ebx = "\\x5b"
_asm_and = "\\x25"
_asm_sub = "\\x2d"

raw_output = False
verbose = True

class tcolors:
    clear = "\033[0m"
    green = "\033[32m"
    red = "\033[31m"
    yellow = "\033[33m"
    blue = "\033[34m"
    gray = "\033[90m"

def success(message):
    if raw_output or not argv.verbose:

    print(f"[\033[32m✓\033[39m] {message}")

def error(message):
    if raw_output or not argv.verbose:


def debug(message):
    if raw_output or not argv.verbose:

    print(f"[\033[34m*\033[39m] {message}") 

def title(title):
    if raw_output or not argv.verbose:

    print("\n" + ("=" * 45))
    print(f" {title}")
    print("=" * 45)

def bytearr_to_bytestr(data):
    return ''.join(f"\\x{'{:02x}'.format(x)}" for x in data)

def bytestr_to_bytearr(data):
    return list(bytearray.fromhex(data.replace("\\x", " ")))

    Prepare our goodchar array. To vary the output subencoded shellcode, we generate
    goodchar list randomly. We can do this since we are bruteforcing correct sub combination.

    The goodchar is composed of alpha numeric characters from 0x01 to 0x7f.

    Notice: It is possible to filter even more badchars within the range of 0x01-0x7f,
            use remove(character_to_remove).
goodchars = [] # to be generated

def generate_goodchars():
    global goodchars

    goodchars = list(range(1, 128)) # 0x01 --> 0x7f


    # goodchars.remove(0x2) # Uncommend if you need to filter extra badchars

    # ...


    Contains required three sub operation for one byte.

    If carry is set to True, next byte must be decreased by 1.

class ByteSubCombination:
    def __init__(self, first, second, third, carry):
        self.first = first
        self.second = second
        self.third = third
        self.carry = carry


    Encoder Object

class Encoder:

    def __init__(self, instruction_prefix, assembly_prefix):
        self.instruction_prefix = instruction_prefix
        self.assembly_prefix = assembly_prefix

    def get_row_value(self, row, encoded=False):
        value = ""
        for col in row:
            value += "{:02x}".format(col)

        if encoded:
            value = bytearr_to_bytestr(struct.pack(' 1:
                                first -= 1
                            elif second > 1:
                                second -= 1
                                third -= 1  

                        return ByteSubCombination(first, second, third, carry)

        return None

    def generate(self):         
        result = None
        for index, value in enumerate(self.complement.to_bytes(4, "little")):
            result = self.sub_encode(value, result)

            if result:
                self.matrice[0][3 - index] = result.first
                self.matrice[1][3 - index] = result.second
                self.matrice[2][3 - index] = result.third   
                return False

        return True 


    Entry Point

if __name__ == "__main__":

    # Parse Arguments
    supported_output_format = [

    argument_parser = argparse.ArgumentParser(description=f"Yet another Shellcode Sub Encoder by {tcolors.blue}@DarkCoderSc{tcolors.clear}")

    argument_parser.add_argument('-s', '--shellcode', type=str, dest="shellcode", action="store", required=True,
                                     help=f"Shellcode in C style format. Place payload between double quotes. Example: \"{tcolors.blue}\\x01\\x02\\x03\\x04\\x05\\x06...\\x0a{tcolors.clear}\"")

    argument_parser.add_argument('-n', '--name', type=str, dest="python_var_name", default="encoded_shellcode", action="store", required=False, 
                                    help="Define the variable name when exporting encoded shellcode as python or c formated variable.")

    argument_parser.add_argument('-f', '--format', type=str, dest="output_format", default="python_commented", action="store", required=False, 
                                    help=f"Define output format of encoded shellcode. (\"{', '.join(supported_output_format)}\")")  

    argument_parser.add_argument('-v', '--verbose', dest="verbose", default=False, action="store_true", required=False, help="Enable verbosity (always disabled on output raw format).")

        argv = argument_parser.parse_args()     
    except IOError:

    # Translate payload from python byte array string to byte array

        bytes_to_encode = bytestr_to_bytearr(argv.shellcode)
        error("Malformed instructions. Please check help for expected format.")


    # Check output format value
    argv.output_format = argv.output_format.lower()
    if not argv.output_format in supported_output_format:
        error(f"Invalid output format value. (Supported: \"{', '.join(supported_output_format)}\")")


    raw_output = (argv.output_format == supported_output_format[5]) # if we are using output raw, we don't verbose anything else than encoded shellcode to stdout

    success(f"Payload successfully loaded=[{argv.shellcode}]")

    # Align payload (requires to be a multiple of 4)

    align_delta = (len(bytes_to_encode) % 4)
    if (align_delta > 0):
        debug("Payload requires alignement of {} bytes".format((4 - align_delta)))

        for i in range((4 - align_delta)):
            bytes_to_encode.append(0x90) # align with extra NOP's
        debug("No alignement needed.")  

    # Split payload in chunk of 4 bytes

    encoded_instruction = []    
    first = True
    for i in reversed(range(0, len(bytes_to_encode), 4)):       
        instruction_chunk = bytes(bytes_to_encode[i:(i+4)])
            debug(f"Attempt to sub encode instruction=[{tcolors.blue}0x{instruction_chunk.hex()}{tcolors.clear}]...")

            instruction = EncodeInstruction(instruction_chunk, not first)
            first = False

            success(f"Candidate successfully sub encoded.") 

            debug(f"Verify if sub encoding instructions result in original instruction...") 

            if instruction.validate():
                success("Verification succeed!")
                raise Exception("Verification failed")

        except Exception as e:
            error(f"FATAL: Could not sub encode instructions. Instruction=[{instruction_chunk}] Msg=[{str(e)}]")

    if len(encoded_instruction) > 0:
        opcode = ""
        commented_python = ""
        for instruction in encoded_instruction:         
            commented_python += instruction.to_commented_python() + "\n"
            opcode += instruction.to_raw()

        raw = opcode.replace("\\x", "")

        if not raw_output:      
            if argv.output_format == supported_output_format[0]:
                title("Commented Python")
            elif argv.output_format == supported_output_format[1]:
                output = f"{argv.python_var_name} = b\"\"\n"
                for chunk in wrap(opcode, 64):
                    output += f"{argv.python_var_name} += b\"{chunk}\"\n"

            elif argv.output_format == supported_output_format[2]:
                title("C / CPP")
                output = f"unsigned char {argv.python_var_name}[] = \\\n"
                for chunk in wrap(opcode, 64):
                    output += f"\t\"{chunk}\"\n"
                output = output[:-1] + ";"

            elif argv.output_format == supported_output_format[3]:
                title("String (C / Cpp / Python etc...)")
            elif argv.output_format == supported_output_format[4]:
                title("Hex string")

            old_len = len(bytes_to_encode)
            new_len = int(len(raw) / 2)


            debug(f"Original Payload Length: {tcolors.blue}{old_len}{tcolors.clear} bytes.")
            debug(f"Encoded Payload Length: {tcolors.blue}{new_len}{tcolors.clear} bytes (+{tcolors.blue}{int((new_len * 100) / old_len)}%{tcolors.clear}).")
            debug(f"Required stack space for decoding: {tcolors.blue}{len(bytes_to_encode)}{tcolors.clear} bytes.\n")

Written the Nov. 27, 2020, 12:14 p.m. by Jean-Pierre LESUEUR

Updated: ago.