
Text file src/simd/archsimd/_gen/simdgen/ops/BitwiseLogic/go.yaml

Documentation: simd/archsimd/_gen/simdgen/ops/BitwiseLogic

!sum
# In the XED data, *all* floating point bitwise logic operations have their
# operand type marked as uint. We are not trying to understand why Intel
# decided that they want FP bit-wise logic operations, but this irregularity
# has to be dealt with in separate rules with some overwrites.

# For many bit-wise operations, we have the following non-orthogonal
# choices:
#
# - Non-masked AVX operations have no element width (because it
# doesn't matter), but only cover 128 and 256 bit vectors.
#
# - Masked AVX-512 operations have an element width (because it needs
# to know how to interpret the mask), and cover 128, 256, and 512 bit
# vectors. These only cover 32- and 64-bit element widths.
#
# - Non-masked AVX-512 operations still have an element width (because
# they're just the masked operations with an implicit K0 mask) but it
# doesn't matter! This is the only option for non-masked 512 bit
# operations, and we can pick any of the element widths.
#
# We unify with ALL of these operations and the compiler generator
# picks when there are multiple options.

# TODO: We don't currently generate unmasked bit-wise operations on 512 bit
# vectors of 8- or 16-bit elements. AVX-512 only has *masked* bit-wise
# operations for 32- and 64-bit elements; while the element width doesn't matter
# for unmasked operations, right now we don't realize that we can just use the
# 32- or 64-bit version for the unmasked form. Maybe in the XED decoder we
# should recognize bit-wise operations when generating unmasked versions and
# omit the element width.

# For binary operations, we constrain their two inputs and one output to the
# same Go type using a variable.
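#
# As a sketch, assuming the aliases expand as in standard YAML, the first
# And rule below reads the same as this spelled-out form, where every
# operand carries the same variable $t and so must resolve to one Go type:
#
#   - go: And
#     asm: "VPAND[DQ]?"
#     in:
#     - go: $t
#     - go: $t
#     out:
#     - go: $t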

- go: And
  asm: "VPAND[DQ]?"
  in:
  - &any
    go: $t
  - *any
  out:
  - *any

- go: And
  asm: "VPANDD" # Fill in the gap, And is missing for Uint8x64 and Int8x64
  inVariant: []
  in: &twoI8x64
  - &i8x64
    go: $t
    overwriteElementBits: 8
  - *i8x64
  out: &oneI8x64
  - *i8x64

- go: And
  asm: "VPANDD" # Fill in the gap, And is missing for Uint16x32 and Int16x32
  inVariant: []
  in: &twoI16x32
  - &i16x32
    go: $t
    overwriteElementBits: 16
  - *i16x32
  out: &oneI16x32
  - *i16x32

- go: AndNot
  asm: "VPANDN[DQ]?"
  operandOrder: "21" # switch the arg order
  in:
  - *any
  - *any
  out:
  - *any

- go: AndNot
  asm: "VPANDND" # Fill in the gap, AndNot is missing for Uint8x64 and Int8x64
  operandOrder: "21" # switch the arg order
  inVariant: []
  in: *twoI8x64
  out: *oneI8x64

- go: AndNot
  asm: "VPANDND" # Fill in the gap, AndNot is missing for Uint16x32 and Int16x32
  operandOrder: "21" # switch the arg order
  inVariant: []
  in: *twoI16x32
  out: *oneI16x32

- go: Or
  asm: "VPOR[DQ]?"
  in:
  - *any
  - *any
  out:
  - *any

- go: Or
  asm: "VPORD" # Fill in the gap, Or is missing for Uint8x64 and Int8x64
  inVariant: []
  in: *twoI8x64
  out: *oneI8x64

- go: Or
  asm: "VPORD" # Fill in the gap, Or is missing for Uint16x32 and Int16x32
  inVariant: []
  in: *twoI16x32
  out: *oneI16x32

- go: Xor
  asm: "VPXOR[DQ]?"
  in:
  - *any
  - *any
  out:
  - *any

- go: Xor
  asm: "VPXORD" # Fill in the gap, Xor is missing for Uint8x64 and Int8x64
  inVariant: []
  in: *twoI8x64
  out: *oneI8x64

- go: Xor
  asm: "VPXORD" # Fill in the gap, Xor is missing for Uint16x32 and Int16x32
  inVariant: []
  in: *twoI16x32
  out: *oneI16x32

- go: tern
  asm: "VPTERNLOGD|VPTERNLOGQ"
  in:
  - &tern_op
    go: $t
  - *tern_op
  - *tern_op
  - class: immediate
    immOffset: 0
    name: table
  inVariant: []
  out:
  - *tern_op
