Advanced Documentation and ASM Code Library

Lose Yoshi - SA-1 Hybrid:
Code
	!dp = $0000
	!addr = $0000
	!sa1 = 0
	!gsu = 0

if read1($00FFD6) == $15
	sfxrom
	!dp = $6000
	!addr = !dp
	!gsu = 1
elseif read1($00FFD5) == $23
	sa1rom
	!dp = $3000
	!addr = $6000
	!sa1 = 1
endif

LoseYoshi:
	JSR .Sub
	RTL
.Sub
	PHX
	LDX $18DF|!addr
	LDA #$10
	STA.w !163E-1,x			;>$163D
	LDA #$03
	STA $1DFA|!addr
	LDA #$13
	STA $1DFC|!addr
	LDA #$02
	STA.b !C1-1,x			;>$C1
	STZ $187A|!addr
	STZ $0DC1|!addr
	LDA #$C0
	STA $7D
	STZ $7B
	LDY.w !157C-1,x			;>$157B
	LDA .RunAwaySpeed,y
	STA.b !B6-1,x			;>$B5
	STZ.w !1594-1,x			;>$1593
	STZ.w !151C-1,x			;>$151B
	STZ $18AE|!addr
	LDA #$30
	STA $1497|!addr
	PLX
	RTS

.RunAwaySpeed
	db $10,$F0

Give thanks to RPG hacker for working on Asar.
For some reason, Akaginite deleted some of his routines from his file bin. Here they are (all of them work on unsigned values):
Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; 16bit * 16bit Multiplication (non-sa-1)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Arguments
; $00-$01 : Multiplicand
; $02-$03 : Multiplier
; Return values
; $04-$07 : Product
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

MathMul16_16:	REP #$20
		LDY $00
		STY $4202
		LDY $02
		STY $4203
		STZ $06
		LDY $03
		LDA $4216
		STY $4203
		STA $04
		LDA $05
		REP #$11
		ADC $4216
		LDY $01
		STY $4202
		SEP #$10
		CLC
		LDY $03
		ADC $4216
		STY $4203
		STA $05
		LDA $06
		CLC
		ADC $4216
		STA $06
		SEP #$20
		RTS


32bit*32bit = 64bit multiplication (note: uses up all 16 bytes
of scratch RAM, $00-$0F).
Code
;SA-1 check
	!dp = $0000
	!addr = $0000
	!sa1 = 0
	!gsu = 0

if read1($00FFD6) == $15
	sfxrom
	!dp = $6000
	!addr = !dp
	!gsu = 1
elseif read1($00FFD5) == $23
	sa1rom
	!dp = $3000
	!addr = $6000
	!sa1 = 1
endif

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Unsigned 32bit * 32bit Multiplication
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Argument
; $00-$03 : Multiplicand
; $04-$07 : Multiplier
; Return values
; $08-$0F : Product
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;GHB's note to self:
;$4202 = Multiplicand
;$4203 = Multiplier
;$4216 = Product
;During SA-1:
;$2251 = Multiplicand
;$2253 = Multiplier
;$2306 = Product

if !sa1 != 0
	!Reg4202 = $2251
	!Reg4203 = $2253
	!Reg4216 = $2306
else
	!Reg4202 = $4202
	!Reg4203 = $4203
	!Reg4216 = $4216
endif

MathMul32_32:
		if !sa1 != 0
			STZ $2250
			STZ $2252
		endif
		REP #$21
		LDY $00
		BNE +
		STZ $08
		STZ $0A
		STY $0C
		BRA ++
+		STY !Reg4202
		LDY $04
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		STZ $0A
		STZ $0C
		LDY $05
		LDA !Reg4216		;>This is always spitting out as 0.
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $08
		LDA $09
		ADC !Reg4216
		LDY $06
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $09
		LDA $0A
		ADC !Reg4216
		LDY $07
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0A
		LDA $0B
		ADC !Reg4216
		STA $0B
		
++		LDY $01
		BNE +
		STY $0D
		BRA ++
+		STY !Reg4202
		LDY $04
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		LDY #$00
		STY $0D
		LDA $09
		ADC !Reg4216
		LDY $05
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $09
		LDA $0A
		ADC !Reg4216
		LDY $06
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0A
		LDA $0B
		ADC !Reg4216
		LDY $07
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0B
		LDA $0C
		ADC !Reg4216
		STA $0C
		
++		LDY $02
		BNE +
		STY $0E
		BRA ++
+		STY !Reg4202
		LDY $04
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		LDY #$00
		STY $0E
		LDA $0A
		ADC !Reg4216
		LDY $05
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0A
		LDA $0B
		ADC !Reg4216
		LDY $06
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0B
		LDA $0C
		ADC !Reg4216
		LDY $07
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0C
		LDA $0D
		ADC !Reg4216
		STA $0D
		
++		LDY $03
		BNE +
		STY $0F
		BRA ++
+		STY !Reg4202
		LDY $04
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		LDY #$00
		STY $0F
		LDA $0B
		ADC !Reg4216
		LDY $05
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0B
		LDA $0C
		ADC !Reg4216
		LDY $06
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0C
		LDA $0D
		ADC !Reg4216
		LDY $07
		STY !Reg4203
		if !sa1 != 0
			STZ $2254	;>Multiplication actually happens when $2254 is written.
			NOP		;\Wait till multiplication is done
			BRA $00		;/
		endif
		
		STA $0D
		LDA $0E
		ADC !Reg4216
		STA $0E
++		SEP #$20
		RTL
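
Both multiplication routines above build a wide product out of 8-bit by 8-bit hardware multiplies, adding each partial product at the matching byte offset. As a rough way to cross-check results on a PC, here is a minimal Python sketch of that idea. It is not a line-by-line translation of the assembly, and `mul_by_bytes` is a hypothetical name used only for this example.

```python
def mul_by_bytes(a, b, nbytes):
    """Multiply two unsigned nbytes-wide integers using only 8x8-bit products."""
    mask = (1 << (8 * nbytes)) - 1
    a &= mask
    b &= mask
    product = 0
    for i in range(nbytes):                # byte i of the multiplicand
        a_byte = (a >> (8 * i)) & 0xFF
        if a_byte == 0:
            continue                       # a zero byte contributes nothing
        for j in range(nbytes):            # byte j of the multiplier
            b_byte = (b >> (8 * j)) & 0xFF
            partial = a_byte * b_byte      # what one 8x8 hardware multiply returns
            product += partial << (8 * (i + j))
    return product


# nbytes=2 mirrors the 16-bit routine, nbytes=4 the 32-bit one.
print(hex(mul_by_bytes(0x1234, 0x5678, 2)))
print(hex(mul_by_bytes(0x12345678, 0x9ABCDEF0, 4)))
```

Comparing the printed values against the scratch RAM output ($04-$07 or $08-$0F) is one way to sanity-check a port of these routines.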

Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; 16bit / 16bit Division
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Arguments
; $00-$01 : Dividend
; $02-$03 : Divisor
; Return values
; $00-$01 : Quotient
; $02-$03 : Remainder
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

MathDiv:	REP #$20
		ASL $00
		LDY #$0F
		LDA.w #$0000
-		ROL A
		CMP $02
		BCC +
		SBC $02
+		ROL $00
		DEY
		BPL -
		STA $02
		SEP #$20
		RTS

At the time of writing this post, Asar v1.50 appears to lack a way to round A divided by B to the nearest integer. After learning about quotients and remainders (and about fractions represented as two integers rather than with a radix point), I found a solution to this problem.

Macro that rounds the unsigned division A/B to the nearest integer:
Code
macro UnsignRoundDivision(ValueA, ValueB)
	!Quotient		= <ValueA>/<ValueB>
	!Remainder		= <ValueA>%<ValueB>
	!HalfDivisorTrunc	= <ValueB>/2
	!HalfDivisor_Remainder	= <ValueB>%2

	;Get the halfway point:
	!HalfDivisor		= !HalfDivisorTrunc ;>Default: rounded down
	if !HalfDivisor_Remainder == 1
	 !HalfDivisor		= !HalfDivisorTrunc+1 ;>Odd divisor: bump the halfway point up by 1
	endif

	;Round upwards if the remainder is at or above the halfway point:
	!RoundedQuotient	= !Quotient
	if !Remainder >= !HalfDivisor
	 !RoundedQuotient	= !Quotient+1
	endif

	;!RoundedQuotient should now hold the rounded quotient
endmacro


Test values (place these after the macro definition):
Code
;rounding test

!ValueATest	= 100
!ValueBTest	= 40

%UnsignRoundDivision(!ValueATest, !ValueBTest)

print "round testing ", dec(!ValueATest),"/",dec(!ValueBTest),"=",dec(!RoundedQuotient)

!ValueATest1	= 100
!ValueBTest1	= 41

%UnsignRoundDivision(!ValueATest1, !ValueBTest1)

print "round testing1 ", dec(!ValueATest1),"/",dec(!ValueBTest1),"=",dec(!RoundedQuotient)

Originally posted by Asar
Code
round testing 100/40=3
round testing1 100/41=2



Feel free to use this however you want (improvements to the design are welcome). Also, this is my first time using a macro.
Asar does have a round function, called "round". You'll need to specify how many decimal places you want to keep though.
Code
!val #= round(3/2, 0)

For some actual code though: Here are some rounding functions from RPG Hacker's shared lib:
Code
; Returns the absolute value of num
; used by trunc
function abs(num) = select(less(num, 0), num*-1, num)

; Returns the truncated value of a decimal number
; Note that pure unsigned int conversion acts weirdly on most negative numbers here,
; which is why we actually need the select()
function trunc(decimal_number) = select(less(decimal_number, 0), -(abs(decimal_number)|0), decimal_number|0)

; Returns the fractional part of a decimal number
; This originally used (decimal_number%1) internally,
; but that ended up being ridiculously inaccurate in some cases.
; This version actually seems to work better in those cases,
; so I hope it's also the overall better solution.
function frac(decimal_number) = decimal_number-trunc(decimal_number)

; Returns the floor of a decimal number
function floor(decimal_number) = trunc(decimal_number)-select(less(decimal_number, trunc(decimal_number)), 1, 0)

; Returns the ceiling of a decimal number
function ceiling(decimal_number) = -floor(-decimal_number)

Remember to use "math round off" when using these; otherwise all decimal values are immediately truncated.
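
To see how these functions relate to each other, here is a small Python sketch of the same definitions. It only mirrors the formulas from the post (floor built from trunc, ceiling built from floor) and is not part of the shared lib.

```python
def trunc(x):
    """Drop the fractional part, rounding toward zero."""
    return -int(abs(x)) if x < 0 else int(x)


def frac(x):
    """Fractional part, computed as x minus its truncation."""
    return x - trunc(x)


def floor(x):
    # Truncation rounds negative non-integers up, so step down by one in that case.
    return trunc(x) - (1 if x < trunc(x) else 0)


def ceiling(x):
    # Mirror the number, floor it, then mirror it back.
    return -floor(-x)


print(trunc(-1.5), floor(-1.5), ceiling(1.2), frac(1.25))
```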
Originally posted by GreenHammerBro
For some reason, Akaginite deleted some of his routines in his file bin, here they are (all of them are unsigned):


because I hate you so much #lm{gfxby}

for forum moderators: Feel free to delete this shit post (*^_^*)
I'm starting to understand how this routine works:

Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; unsigned 16bit / 16bit Division
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Arguments
; $00-$01 : Dividend
; $02-$03 : Divisor
; Return values
; $00-$01 : Quotient
; $02-$03 : Remainder
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

MathDiv:	REP #$20
		ASL $00			;>Shift the dividend left so its bit 15 (leftmost) lands in carry.
		LDY #$0F		;>Loop counter: process each of the 16 bits of the dividend.
		LDA.w #$0000		;>Start A (the running remainder) at #$0000
-		ROL A			;>Rotate A left: the "brought down" dividend bit in carry enters A's bit 0.
		CMP $02			;>Does the divisor fit into A?
		BCC +			;>If not, the quotient bit is 0 (carry clear); go bring down the next bit.
		SBC $02			;>Otherwise subtract the divisor; carry stays set, so the quotient bit is 1.
+
;Here the quotient bit (held in carry) enters $00 from the right.
;Since binary digits are only 0 or 1, no multiplication is needed:
;divisor times the quotient bit is either 0 or the divisor itself.

		ROL $00			;>Rotate the dividend left: the quotient bit enters bit 0 and the next dividend bit is "brought down" into carry.
		DEY			;>Decrement the bit counter
		BPL -			;>Loop until all 16 bits are done.
		STA $02			;>Store remainder
		SEP #$20
		RTL
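
The commented loop above can also be followed step by step in a higher-level language. Below is a minimal Python model of it, with the carry flag kept as an explicit variable. The function name `math_div` and the 16-bit masking are my own choices for the sketch, and division by zero is not handled specially.

```python
def math_div(dividend, divisor):
    """Return (quotient, remainder) the way the shift-and-subtract loop builds them."""
    work = dividend & 0xFFFF                  # plays the role of $00-$01
    acc = 0                                   # plays the role of A (running remainder)

    carry = (work >> 15) & 1                  # ASL $00: top bit falls into carry
    work = (work << 1) & 0xFFFF

    for _ in range(16):
        acc = ((acc << 1) | carry) & 0xFFFF   # ROL A: bring down one dividend bit
        if acc >= divisor:                    # CMP $02 / BCC +
            acc -= divisor                    # SBC $02
            carry = 1                         # quotient bit is 1
        else:
            carry = 0                         # quotient bit is 0
        next_carry = (work >> 15) & 1         # ROL $00: next dividend bit comes out...
        work = ((work << 1) | carry) & 0xFFFF # ...while the quotient bit goes in
        carry = next_carry

    return work, acc


print(math_div(100, 7))
```

After 16 passes every dividend bit has been shifted out of `work` and replaced by a quotient bit, which is why the quotient ends up where the dividend started.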


Further explanation:


PSST, a video showing something that works similarly to this was posted recently.
Alright, I have got some useful functions which might have some usage.
Code
function GetSpriteProperties(Y, X, PP, CCC, T) = ((Y&1)<<7)|((X&1)<<6)|((PP&3)<<4)|((CCC&7)<<1)|(T&1)
function GetTileProperties(Y, X, P, CCC, TT) = ((Y&1)<<7)|((X&1)<<6)|((P&1)<<5)|((CCC&7)<<2)|(TT&3)
function GetTilemapVRAM(X, Y, Layer) = Layer|((Y&$20)<<6)|((X&$20)<<5)|((Y&$1F)<<5)|(X&$1F)
function BigEndian(LittleEndian) = ((LittleEndian&$FF00)>>8)|((LittleEndian&$00FF)<<8)

Here is a short explanation of what they do:
  • GetSpriteProperties and GetTileProperties: Transform the input into a usable YXPPCCCT and YXPCCCTT format, respectively.
  • GetTilemapVRAM: Converts the X and Y coordinates of a tile within a tilemap, together with the tilemap base address, into a VRAM address. For SMW hacking, assuming you haven't changed the tilemap addressing and use Lunar Magic as usual, the valid Layer values are $3000 for layer 1 in levels, $3800 for layer 2 in levels, $2000 for layer 1 on the overworld, $3000 for layer 2 on the overworld and $5000 for layer 3 in both overworld and levels.
  • BigEndian: Swaps the low and high byte (could have called it "SwapLowHigh" but w/e, especially since afaik Asar converts word values into little endian anyway). Useful for the stripe image header, which is big endian rather than little endian like the SNES itself.
EDIT: Optimized and explained further.
EDIT 2: grammar.
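
To make the bit layouts easier to check, here is a Python sketch of the two property functions and the byte swap. It copies the masks and shifts from the Asar functions above; the lowercase names are just Python-style renamings for the example.

```python
def get_sprite_properties(y, x, pp, ccc, t):
    """Pack the fields into the YXPPCCCT layout."""
    return ((y & 1) << 7) | ((x & 1) << 6) | ((pp & 3) << 4) | ((ccc & 7) << 1) | (t & 1)


def get_tile_properties(y, x, p, ccc, tt):
    """Pack the fields into the YXPCCCTT layout."""
    return ((y & 1) << 7) | ((x & 1) << 6) | ((p & 1) << 5) | ((ccc & 7) << 2) | (tt & 3)


def big_endian(word):
    """Swap the low and high byte of a 16-bit value."""
    return ((word & 0xFF00) >> 8) | ((word & 0x00FF) << 8)


# Y flip set, priority 3, palette 7, second GFX page
print(format(get_sprite_properties(1, 0, 3, 7, 1), "08b"))
print(hex(big_endian(0x1234)))
```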

Extended (OAM $0200) Finish OAM routine

I guess this piece of code could be useful for people who use the OAM X high byte in non-normal sprites (clusters, extended, etc.), or in sprites that use OAM $0200 in general. It's based on SMW's finish OAM routine for normal sprites, but I optimized it as much as possible.

It's designed to be called as an in-bank subroutine, but you could adapt it to a PIXI/GIEPY subroutine (don't forget to change the data bank). Be warned that it doesn't preserve Y, since it's assumed to be used for one-tile sprites.

Code
;Set size & X pos high byte for OAM $0200
;-Input:
; -A       = Size (#$00 = 8x8, #$01 (not #$02) = 16x16)
; -Y       = Index to OAM
; -$00     = "Real" tile X pos: $0200,y minus any offset added
; -$0C     = Sprite X pos, 16 bit
; -$0200,y = Tile X pos

ExtendedFinishOAM:
	sta $0F			;>  Store size for later usage.

	stz $03			;\
	lda $0200,y		; |
	sec : sbc $00		; |
	bpl +			; | Calculate tile's "displacement".
	dec $03			; |
+	sta $02			;/

	tya : lsr #2 : tay	;>  Prepare Y for indexing $0420.
	rep #$20		;>  Set 16 bit mode.
	lda $0C			;\
	clc : adc $02		; | Get sprite X pos + displacement - screen X pos.
	sec : sbc $1A		;/
	sep #$20		;>  Set 8 bit mode.
	xba			;\  Carry is set if A > #$00FF & < #$0200 (being more than #$0200 would never happen I think)
	lsr			;/  A.K.A. if sprite is offscreen horizontally.
	lda $0F			;\  Now we load the size bit...
	rol			; | ...combine it with carry's result...
	sta $0420,y		;/  ...and store it.

	rts			;>  Have fun with your sprite effects that won't teleport with the right screen boundary anymore.
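
As a way to reason about which value ends up in $0420,y, here is a Python sketch of the routine's arithmetic as I read it. The function and parameter names are invented for the example; `screen_x` stands for the 16-bit value at $1A.

```python
def extended_finish_oam(size, tile_x, real_x, sprite_x, screen_x):
    """Return the byte the routine would store to $0420,y.

    size     : 0 = 8x8, 1 = 16x16 (the value passed in A)
    tile_x   : 8-bit X already written to $0200,y
    real_x   : 8-bit on-screen X without any tile offset ($00)
    sprite_x : 16-bit sprite X position ($0C-$0D)
    screen_x : 16-bit layer 1 X position ($1A-$1B)
    """
    displacement = (tile_x - real_x) & 0xFF
    if displacement & 0x80:                   # sign-extend the 8-bit difference
        displacement |= 0xFF00
    x16 = (sprite_x + displacement - screen_x) & 0xFFFF
    x_high_bit = (x16 >> 8) & 1               # XBA : LSR moves this bit into carry
    return ((size << 1) | x_high_bit) & 0xFF  # LDA $0F : ROL


# 8x8 tile fully on screen, then a 16x16 tile hanging over the left edge.
print(extended_finish_oam(0, 0x10, 0x10, 0x0100, 0x0008))
print(extended_finish_oam(1, 0x10, 0x10, 0x0005, 0x0010))
```

Bit 0 of the result is the X position's ninth bit and bit 1 is the size, which is why the size input has to be #$01 rather than #$02.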


Example of usage:

Code
	jsr GetDrawInfo	;> X pos in $01, Y pos in $02, index in Y.	

	rep #$20
	lda $01
	sta $0200,y
	sep #$20
	sta $00		;>  Store X pos in $00

	lda $64
	sta $0203,y

	lda #$1F	;\
	sta $0202,y	;/  Fun fact: bubble tile in my 8-bit like hack

	lda $171F,x	;\
	sta $0C		;/  Extended sprite X pos low byte.
	lda $1733,x	;\
	sta $0D		;/  Extended sprite X pos high byte.
	lda #$00	;>  Size is 8x8.
	jsr ExtendedFinishOAM


A very optimised way to get the item memory offset:
Code
	LDA $9B		; Load X position high byte
	STA $00		; $00 = XXXXXXXX
	LDA $99		; Load Y position high byte
	LSR		; Shift bit 0 into carry
	ROL $00		; Shift carry into bit 0: $00 = XXXXXXXY
	LDA $9A		; Load X position low byte
	ASL		; Shift bit 7 into carry
	ROL $00		; BAM! $00 = XXXXXXYx


Background information: SMW's code for getting the item memory offset (that is, the secondary index used to reach a column's item memory) is very unoptimised. Specifically, $00C00D (I call it SetItemMemory) uses a large amount of bit shifting (27 lines of code as a result), while $0DA5F4 (I call it GetItemMemory) checks specific bits and uses lots of branches (15 lines of code as a result). There is a way to condense this code, though.
The idea is that the offset uses the following format: XXXXXXYx, where
  • XXXXXX is the X position high byte / screen number (technically bits 0 to 5, but SMW only allows a sublevel to have up to 32 screens)
  • Y is the Y position high byte (technically bit 0 of Y high, but blocks can only have either 0 or 1 as Y high)
  • x is X position bit 7

It also uses the fact that shifting affects the carry flag, and that the rotate opcodes (ROL and ROR) let you shift the carry flag into the accumulator or RAM. That allows you to shift the screen number and add the other two bits to the offset at the same time!
Obviously, the X and Y positions (at least the high bytes) are swapped in vertical levels, but that swap is done beforehand, along with the rest of the item memory code.
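
The two shift-then-rotate steps can be mimicked in Python with an explicit carry variable. This is only a sketch of the bit movement described above; `item_memory_offset` is a hypothetical name.

```python
def item_memory_offset(x_high, y_high, x_low):
    """Build the XXXXXXYx offset with the same two shift-and-rotate steps."""
    offset = x_high & 0xFF                    # STA $00        -> XXXXXXXX
    carry = y_high & 1                        # LDA $99 : LSR  -> Y high bit 0 in carry
    offset = ((offset << 1) | carry) & 0xFF   # ROL $00        -> XXXXXXXY
    carry = (x_low >> 7) & 1                  # LDA $9A : ASL  -> X low bit 7 in carry
    offset = ((offset << 1) | carry) & 0xFF   # ROL $00        -> XXXXXXYx
    return offset


# Screen 3, lower half (Y high = 1), right half of the screen (X bit 7 set)
print(format(item_memory_offset(0x03, 0x01, 0x80), "08b"))
```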

Edit: Argh! I have swapped the X and Y high bytes! >_>
Random shuffle table.

Code
;Don't touch this
	!dp = $0000
	!addr = $0000
	!sa1 = 0
	!gsu = 0

if read1($00FFD6) == $15
	sfxrom
	!dp = $6000
	!addr = !dp
	!gsu = 1
elseif read1($00FFD5) == $23
	sa1rom
	!dp = $3000
	!addr = $6000
	!sa1 = 1
endif

;Table to shuffle (set after the SA-1 check so that !sa1 is already defined)
 if !sa1 == 0
  !Scratchram_PermutationShuffleTbl = $7F844A
 else
  !Scratchram_PermutationShuffleTbl = $400198
 endif

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;Permutation/Shuffle algorithm, by GreenHammerBro
;
;Randomly arranges 8-bit integer values in a table at an address
;defined by  "!Scratchram_PermutationShuffleTbl". For example,
;shuffle 3 values:
;
;$7F844A: 00 01 02
;
;After calling the routine, it MAY be this
;
;$7F844A: 02 00 01
;
;Input:
; -$00 = The number of values you want to shuffle in this
;  table, minus 1 (if you want to shuffle 3 values, have
;  $00 contain the value #$02). Up to 128 values ($00 = #$7F)
;  is safe (you wouldn't have that many values anyway).
;Overwritten/Destroyed:
; -$01 = Used for swapping two "selected" numbers for the RNG
;  arrangement.
;Output:
; -!Scratchram_PermutationShuffleTbl will be shuffled.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
PermutationShuffle:
	LDA $00
	TAX						;>Start at  last number and work towards first ("Select index", the "countdown")
	
	.Loop
	LDA $00						;>Shuffle this number amount of values.
	PHY						;\
	JSL RNG						;|>number ("Random Select") (Y value gets overwritten in routine $01ACF9).
	PLY						;/
	TAY						;>Random index
	LDA !Scratchram_PermutationShuffleTbl,x		;\Swap the values in the table between "Select index" and "Random Select"
	STA $01						;|>Move "Select Index" to $01
	PHX						;|\Because $xxxxxx,y doesn't exist, temporarily transfer "Random Select"
	TYX						;|/to X index
	LDA !Scratchram_PermutationShuffleTbl,x		;|>Load the "Random Select"
	PLX						;|>Restore X back to the countdown index
	STA !Scratchram_PermutationShuffleTbl,x		;|>Write "Random Select" on "Select Index"
	LDA $01						;|>"Select Index"
	PHX						;|
	TYX						;|>Temporarily move "Random Select" to X again
	STA !Scratchram_PermutationShuffleTbl,x		;|>Write "Select Index" to "Random Select"
	PLX						;/>Restore X back to "countdown"
	
	DEX						;\Next item until "Select index" goes from $00 -> $FF
	BPL .Loop					;/
	RTL


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;Akaginite's (ID:8691) better ranged RNG
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;*Input:
; *A (8-bit) = the maximum number
;*Output
; *A (8-bit) = result (0 to max, inclusive).
;
;Formula:
; int(rand*(max+1)/256)
;
;Do note that if using SA-1, the multiplication registers
;are signed, be careful not to use values more than #$7FFF
;(32767 in decimal).
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

RNG:	INC A			;>Because the number is truncated, without it, would be 1-less than intended max.
	BEQ .All		;>If decided to use all possible values, simply no calculations for range needed.
	if !sa1 == 0
		STA $4202	;>Max number times RNG(0,255)
	else
		STA $2251
		STZ $2252	;>Remove high byte
	endif
	JSL $01ACF9		;\get random number 0-255
	if !sa1 == 0
		STA $4203	;/
		XBA		;\Wait 8 cycles
		XBA		;|
		NOP		;/
	else
		STA $2253
		STZ $2254		;>This tells SA-1 to perform the calculation here.
		NOP			;\Wait till calculations is done
		BRA $00			;/
	endif
	if !sa1 == 0
		LDA $4217		;>Value 0 to max. (loading high byte as 8-bit = LSR 8 times; divide by 256).
	else
		LDA $2307
	endif
	RTL

.All	JML $01ACF9
	RTL


Simply put: if you have numbers stored in a RAM table, this routine will shuffle their positions.

For example, given the values 09 F9 11 02 9D, the result can be:

7F844A: 02 9d f9 11 09

7F844A: 9d 09 11 f9 02

7F844A: 11 9d 02 09 f9

and so on. This is good for mini-games about which "thing" contains what, where the set of items is always the same but rearranged, or for bosses fought in a random order. I highly recommend using the "Better Random Number Generator", since SMW's RNG has the flaw that it's cleared on level load.

It works by having two selectors whose values get swapped with each other. One is index-based (the countdown index, or "Select Index"): it always starts at the last number and moves to the previous number each loop. The other, the "Random Select", picks a random position anywhere in the table, including the position currently selected by the countdown index (in which case nothing changes on that loop) and positions the countdown has already passed. This is very similar to the Fisher-Yates shuffling algorithm, commonly known as the Knuth shuffle.
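
Here is a rough Python model of the two routines, mainly to show the swap order and the ranged RNG formula. It is a sketch under assumptions: Python's `random` module stands in for SMW's RNG at $01ACF9, and both function names are made up. As far as I can tell from the code, the random position is drawn from the whole table on every loop, whereas the textbook Fisher-Yates shuffle only draws from positions 0 up to the countdown index.

```python
import random


def ranged_rng(rand_byte, maximum):
    """int(rand * (max + 1) / 256); max = 255 passes the raw byte through."""
    if maximum == 0xFF:                       # INC A wraps to 0 -> the .All path
        return rand_byte
    return (rand_byte * (maximum + 1)) >> 8   # high byte of the 16-bit product


def permutation_shuffle(table, count_minus_one, rng=random):
    """Shuffle table[0..count_minus_one] in place and return it."""
    for select_index in range(count_minus_one, -1, -1):   # the countdown index
        random_select = ranged_rng(rng.randrange(256), count_minus_one)
        table[select_index], table[random_select] = (
            table[random_select], table[select_index])
    return table


values = [0x09, 0xF9, 0x11, 0x02, 0x9D]
print([format(v, "02X") for v in permutation_shuffle(values, len(values) - 1)])
```

Whatever order comes out, the table always holds the same set of values, which is the property the mini-game and boss-order use cases rely on.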
Originally posted by Vitor Vilela
8. Find OAM slot. Useful for uberASM codes. Should work in both level and overworld. Note that it may have weird effects if you don't use (and enable) NMSTL:

Code
FindOAM:
	LDY #$FC
-	LDA $02FD,y
	CMP #$F0
	BNE +
	CPY #$3C
	BEQ +
	DEY
	DEY
	DEY
	DEY
	BRA -
+	RTS


SA-1:
Code
FindOAM:
	LDY #$FC
-	LDA $62FD,y
	CMP #$F0
	BNE +
	CPY #$3C
	BEQ +
	DEY
	DEY
	DEY
	DEY
	BRA -
+	RTS

Change what's highlighted to BEQ. It keeps searching for used slots instead of empty slots.
It's correct, actually (an unrolled version is used for NMSTL in fact). What the code does is to look if the previous slot is full (note that $02FD,y is used, not $0300,y) which means, due to how the code works, the next slot is free. It's done this way because when sprites place their tiles, the OAM index is increased, not decreased.
Oh, I get it now, so say index $20 is full, Y will be $20, but $24 is to be used. Here is my own OAM handling code:
Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;Make sure you have the index register be 16-bit (REP #$10)
;and A be 8-bit (SEP #$20) before calling this.
;
;Input:
; Y (16-bit): the starting index to search for OAM.
;             Y = 508 ($1FC) would search all 128 slots.
;             StartingIndex = SlotYouWantToStart*4
;Output:
; Y (16-bit): The index that represents a free slot.
;             if Y = $FFFC, then all slots are taken.
; $00:        The modulo of Yregister MOD 4, used to
;             specify what pairs of bits for $0420 table.
; $01:        Which byte of the $0420 table.
FindOAM:
	.Loop
	LDA $0201|!addr,y		;>Y position
	CMP #$F0			;\Free slot found.
	BEQ .Free			;/
	
	.Next
	DEY #4				;\Next group of 4 bytes
	BPL .Loop			;/
	SEP #$20
	RTL
	
	.Free
	REP #$20
	TYA			;>Transfer IndexNumb to A
	LSR #2			;>Convert IndexNumb -> SlotNumb
	AND #$0003		;>MOD 4
	SEP #$20
	STA $00			;>Store pair bits position to $00.
	REP #$20
	TYA			;\Round downwards to the nearest 4th value
	LSR #2
	SEP #$20
	AND.b #%11111100
	STA $01
	RTL
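
To illustrate the two outputs in $00 and $01, here is a small Python sketch of the index math in the .Free part above. The function name is hypothetical, and it only models the arithmetic, not the slot search.

```python
def oam_high_table_position(oam_index):
    """Map an OAM index (slot * 4) to the routine's ($00, $01) outputs."""
    slot = (oam_index & 0xFFFF) >> 2      # TYA : LSR #2 -> slot number
    bit_pair = slot & 0x03                # AND #$0003 -> stored to $00
    byte_index = slot & 0xFC              # AND.b #%11111100 -> stored to $01
    return bit_pair, byte_index


# Last slot (index $1FC = slot 127) and slot 5 (index $14)
print(oam_high_table_position(0x1FC))
print(oam_high_table_position(0x14))
```

The first value picks the entry in the bit tables (BitTableXHigh / BitTableTileSize) and the second is the offset into the $0420 table used by the test code that follows.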


Let me know if there is room for improvement. Here is an uberasm tool test (test it in a level):
Code
!Freeram_OAMXPos = $60

main:
	LDA $13D4|!addr
	BNE .Done
	LDA $15
	BIT.b #%00000010
	BNE .Left
	BIT.b #%00000001
	BNE .Right
	BRA .OAMWrite
	
	.Left
	REP #$20
	DEC !Freeram_OAMXPos
	SEP #$20
	BRA .OAMWrite
	
	.Right
	REP #$20
	INC !Freeram_OAMXPos
	SEP #$20
	
	.OAMWrite
	REP #$10
	LDY #$1FC
	JSL GHBOAMSearch_FindOAM
	CPY #$FFFC
	BEQ .Done
	LDA !Freeram_OAMXPos
	STA $0200|!addr,y
	LDA #$10
	STA $0201|!addr,y
	LDA #$40
	STA $0202|!addr,y
	LDA.b #%00110000
	STA $0203|!addr,y
	PHY
	
	REP #$20
	LDA $01
	AND #$00FF
	TAY
	LDA $00
	AND #$00FF
	TAX
	SEP #$20
	LDA $0420|!addr,y
	ORA BitTableTileSize,x
	STA $0420|!addr,y
	
	LDA !Freeram_OAMXPos+1
	BEQ .OnScreen
	
	LDA BitTableXHigh,x
	ORA $0420|!addr,y
	STA $0420|!addr,y
	BRA +
	
	.OnScreen
	LDA BitTableXHigh,x
	EOR #$FF
	AND $0420|!addr,y
	STA $0420|!addr,y
	+
	PLY
	
	.Done
	SEP #$30
	RTL
	
BitTableXHigh:
	db %00000001
	db %00000100
	db %00010000
	db %01000000
BitTableTileSize:
	db %00000010
	db %00001000
	db %00100000
	db %10000000

My newer OAM writing routine for uberasm tool (works best in levels and doesn't work on gamemode).

Make sure this doesn't execute every frame during pause: JSL $7F8000 isn't executed while paused, so the routine would fill up more slots each frame.

Also note that you must use this patch in order to prevent weirdness from happening.

This one handles the "extended bits", namely X position bit 8 (there are actually 9 bits, 0-8, instead of the normal bits 0-7) and the tile size ($0420 table). It can therefore display sprite tiles that overlap the left edge of the screen without wrapping to the right edge, and it sets the size of the tiles.

It also has a "smart" off-screen check: simply put, if the tile's body is entirely offscreen (X or Y position is $FFF8 for an 8x8 tile or $FFF0 for a 16x16 tile), it sets the Y position to #$F0 so the slot stays free.

The routine is also "stackable": when you need to call it again, you don't have to set the Y value for the OAM index again, as it picks up right where it left off without rechecking slots that were already checked.
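
The off-screen check can be summarised with a short Python sketch. It follows the comparisons in the routine below as I read them, with the two boundary values taken from its TopLeftScreenBoundaryPos table; `is_offscreen` is a made-up name.

```python
TOP_LEFT_BOUNDARY = (0xFFF9, 0xFFF1)      # index 0 = 8x8, index 1 = 16x16


def is_offscreen(x, y, size):
    """True if the routine would skip the tile; x and y are 16-bit positions."""
    boundary = TOP_LEFT_BOUNDARY[size]
    x &= 0xFFFF
    y &= 0xFFFF
    off_vertically = 0x00E0 <= y < boundary      # below the screen or too far above it
    off_horizontally = 0x0100 <= x < boundary    # right of the screen or too far left
    return off_vertically or off_horizontally


# An 8x8 tile at X = -8 is hidden, at X = -7 one pixel column is still visible.
print(is_offscreen(0xFFF8, 0x10, 0), is_offscreen(0xFFF9, 0x10, 0))
```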

Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;OAM writer for uberasm tool
;Make sure the index registers are 16-bit (REP #$10)
;and A is 8-bit (SEP #$20) before calling this.
;
;Input:
; Y (16-bit): the starting index to search for OAM.
;             Y = 508 ($1FC) would search all 128 slots.
; $00-$01:    OAM X position
; $02-$03:    OAM Y position
; $04:        OAM tile number
; $05:        OAM tile properties
; $06:        Tile size (0 = 8x8, 1 = 16x16)
;
;Output:
; Y (16-bit): Index that represent the previous slot that
;             is free.
;             if Y = $FFFC, then all slots are taken.
; Carry:      Set if offscreen and clear if onscreen.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
WriteOAM:
	.FindFreeSlot
	..Loop
	LDA $0201|!addr,y		;>Y position
	CMP #$F0			;\Free slot found.
	BEQ .SlotFound			;/
	
	...Next
	DEY #4				;\Next group of 4 bytes
	BPL ..Loop			;/
	SEP #$20
	RTL
	
	.SlotFound
	..YPosHandle
	REP #$20
	LDA $06					;\Position #$FFF8 (X and Y) is invisible for 8x8 tiles and
	ASL					;|Position #$FFF0 is invisible for 16x16 tiles.
	AND #$00FF				;/
	TAX
	LDA $02					;\Don't take up a slot of offscreen vertically.
	CMP #$00E0				;|
	BCC .HandleXPos				;|
	CMP.l TopLeftScreenBoundaryPos,x	;|
	BCS .HandleXPos				;/

	.OffScreen
	SEP #$20
	LDA #$F0			;\Be a free slot and not write.
	STA $0201|!addr,y		;/
	SEC
	RTL
	
	.HandleXPos
	SEP #$20
	LDA $02				;\Y position
	STA $0201|!addr,y		;/
	
	..XPositionHandle
	REP #$20
	LDA $00					;\Determine if on-screen horizontally
	CMP #$0100				;|
	BCC ..OnScreenHoriz			;|
	CMP.l TopLeftScreenBoundaryPos,x	;|
	BCS ..OnScreenHoriz			;/
	BRA .OffScreen				;>If not skip the whole thing.
	
	..OnScreenHoriz
	SEP #$20
	LDA $00					;\X position bits 0-7
	STA $0200|!addr,y			;/
	
	REP #$20
	TYA			;>Take index
	LSR #2			;>Divide by 4 to obtain slot number
	SEP #$20		;\Round down to the nearest 4th value (Value = floor(SlotNumb/4)*4)
	AND.b #%11111100	;/
	STA $02			;>Store $0420 indexer to $02
	REP #$20
	TYA			;>Transfer IndexNumb to A
	LSR #2			;>Convert IndexNumb -> SlotNumb
	SEP #$20
	AND.b #%00000011	;>MOD 4
	STA $03			;>Store pair bits position to $03.
	
	PHY
	REP #$20
	LDA $02			;\Because indexes must be 16-bit, I had to clear their high bytes.
	AND #$00FF		;|
	TAY			;/
	LDA $03
	AND #$00FF
	TAX
	SEP #$20
	LDA $01			;\If X position negative, set bit to enable tile
	BNE ...SetXPosBit8	;/exceeding the left edge of screen without wrapping.
	
	...ClearXPosBit8
	LDA BitTableXHigh,x
	EOR #$FF
	AND $0420|!addr,y
	STA $0420|!addr,y
	BRA ...TileSize
	
	...SetXPosBit8
	LDA $0420|!addr,y
	ORA BitTableXHigh,x
	STA $0420|!addr,y
	
	...TileSize
	LDA $06
	BNE ...SixteenBySixteenTile
	
	...EightByEightTile
	LDA BitTableTileSize,x
	EOR #$FF
	AND $0420|!addr,y
	STA $0420|!addr,y
	BRA .HandleTiles
	
	...SixteenBySixteenTile
	LDA $0420|!addr,y
	ORA BitTableTileSize,x
	STA $0420|!addr,y
	
	
	.HandleTiles
	PLY
	LDA $04				;\Tile number
	STA $0202|!addr,y		;/
	LDA $05				;\Tile properties
	STA $0203|!addr,y		;/
	
	.Done
	DEY #4
	CLC
	RTL
	
	TopLeftScreenBoundaryPos:
	dw $FFF9, $FFF1

	BitTableXHigh:
	db %00000001
	db %00000100
	db %00010000
	db %01000000
	
	BitTableTileSize:
	db %00000010
	db %00001000
	db %00100000
	db %10000000


Sample usage:
Code
!Freeram_OAMXPos = $60
!Freeram_OAMYPos = $62
main:
	LDA $15
	BIT.b #%00000010
	BNE .Left
	BIT.b #%00000001
	BNE .Right
	BRA +
	
	.Left
	REP #$20
	DEC !Freeram_OAMXPos
	SEP #$20
	BRA +
	
	.Right
	REP #$20
	INC !Freeram_OAMXPos
	SEP #$20
	
	+
	LDA $15
	BIT.b #%00001000
	BNE .Up
	BIT.b #%00000100
	BNE .Down
	BRA +
	
	.Up
	REP #$20
	DEC !Freeram_OAMYPos
	SEP #$20
	BRA +
	
	.Down
	REP #$20
	INC !Freeram_OAMYPos
	SEP #$20
	
	+
	REP #$30
	LDA !Freeram_OAMXPos
	STA $00
	LDA !Freeram_OAMYPos
	STA $02
	SEP #$20
	LDA #$40
	STA $04
	LDA.b #%00110000
	STA $05
	LDA #$01
	STA $06
	LDY #$01FC
	JSL LibraryOAMWrite_WriteOAM
	
	;Second OAM slot test
	REP #$20
	LDA !Freeram_OAMXPos
	CLC
	ADC #$0010
	STA $00
	LDA !Freeram_OAMYPos
	STA $02
	SEP #$20
	LDA #$40
	STA $04
	LDA.b #%00110000
	STA $05
	LDA #$01
	STA $06
	JSL LibraryOAMWrite_WriteOAM
	SEP #$30
	RTL


The problem is that SMW's sprites use sprite OAM allocation instead of manually checking whether an OAM slot is free, so they can overwrite tiles that this code wrote to slots those sprites were allocated.
Originally posted by MarioE
Here is a square root approximation routine:
Link

In order to use it, load a 16-bit value into A, with the carry bit acting as a 17th bit and call it. It will return the approximation to the square root as a 16-bit value in A, with the carry bit acting as a 17th bit. The return value should be interpreted as 9.8 fixed point.

For example, a return value of $45A0 (C) should be interpreted as $0145.A0.
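
As a sketch of how that interpretation works out numerically (this is not the linked routine itself, and `decode_9_8` is a made-up helper name): treat the carry as bit 16, then divide by 256 because the low 8 bits are the fraction.

```python
def decode_9_8(a, carry):
    """Interpret a 16-bit A plus carry as an unsigned 9.8 fixed-point number."""
    raw = ((carry & 1) << 16) | (a & 0xFFFF)   # carry acts as the 17th bit
    return raw / 256                            # the low 8 bits are the fraction


print(decode_9_8(0x45A0, 1))   # 325.625, i.e. $145 plus $A0/256
```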


I don't understand what you mean by 9.8 fixed point. Do you mean 9 bits before the radix point and 8 bits after (%000000000.00000000)?
Updated my aligned digits display.
I made a routine that determines which horizontal level dimension setting is in use, since $5E isn't 100% reliable: its value can be smaller when the last few columns of screens are not used. It works by comparing $13D7 against a list of every level dimension setting's height. Thankfully, no two settings have the same height with a different number of screens; otherwise this routine would break by always choosing the last of the matching heights.

Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Index what horizontal level dimension currently set by LM that the
;;player is in.
;;
;;Output:
;; X (8-bit) = The index number (increments of 2 for every item setting),
;;             corresponding to what setting in LM's
;;             "Change Properties in Header" -> "Horizontal Level mode"'s
;;             drop-down box. Some examples:
;;
;;             $00 = LVL height = 01B tiles, H-Screens=20 -> X = $00
;;             $01 = LVL height = 01C tiles, H-Screens=20 -> X = $02
;;             $02 = LVL height = 01D tiles, H-Screens=1E -> X = $04
;;             ...
;;             Returns $FE should no setting be found (can be used as
;;             a failsafe detection for invalid level dimension).
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	IndexLevelDimension:
	REP #$20
	LDX.b #(LevelDimensionHeightsEnd-LevelDimensionHeights)-2	;>Start at last item.
	
	.Loop
	LDA $13D7|!addr			;>Current level height
	CMP LevelDimensionHeights,x
	BEQ .HeightFound
	
	..Next
	DEX #2
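	BPL .Loop			;>8-bit X wraps from $00 to $FE after the first entry, so falling through returns X = $FE (not found).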
	
	.HeightFound
	SEP #$20
	RTL
	
	LevelDimensionHeights:
	dw $01B0		;>Setting $00 (index = $00)
	dw $01C0		;>Setting $01 (index = $02)
	dw $01D0		;>Setting $02 (index = $04)
	dw $0200		;>Setting $03 (index = $06)
	dw $0220		;>Setting $04 (index = $08)
	dw $0250		;>Setting $05 (index = $0A)
	dw $0260		;>Setting $06 (index = $0C)
	dw $0280		;>Setting $07 (index = $0E)
	dw $02A0		;>Setting $08 (index = $10)
	dw $02C0		;>Setting $09 (index = $12)
	dw $02F0		;>Setting $0A (index = $14)
	dw $0310		;>Setting $0B (index = $16)
	dw $0340		;>Setting $0C (index = $18)
	dw $0380		;>Setting $0D (index = $1A)
	dw $03B0		;>Setting $0E (index = $1C)
	dw $0400		;>Setting $0F (index = $1E)
	dw $0440		;>Setting $10 (index = $20)
	dw $04A0		;>Setting $11 (index = $22)
	dw $0510		;>Setting $12 (index = $24)
	dw $0590		;>Setting $13 (index = $26)
	dw $0630		;>Setting $14 (index = $28)
	dw $0700		;>Setting $15 (index = $2A)
	dw $0800		;>Setting $16 (index = $2C)
	dw $0950		;>Setting $17 (index = $2E)
	dw $0B30		;>Setting $18 (index = $30)
	dw $0E00		;>Setting $19 (index = $32)
	dw $12A0		;>Setting $1A (index = $34)
	dw $1C00		;>Setting $1B (index = $36)
	dw $3800		;>Setting $1C (index = $38)
	LevelDimensionHeightsEnd:


I put labels around the lookup table in case more entries are added in a future version of LM. The starting index before the loop (LDX.b #(LevelDimensionHeightsEnd-LevelDimensionHeights)-2) then updates automatically, so the user does not have to edit it as well. This is part of my project to make a routine that converts a block coordinate (in units of blocks, not pixels) to a $C800 index. That is useful if you want to store blocks in memory (like a locked gate that, once unlocked, stores its data in RAM so it doesn't respawn when you re-enter the level) using $C800 indexes (2 bytes) instead of coordinates (4 bytes total for X and Y).
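As a rough illustration of how the routine could be called, here is a minimal sketch. ExampleCaller, .InvalidDimension, and the use of scratch RAM $00 are hypothetical choices made for this example, and it assumes 8-bit A and X/Y on entry (the routine itself loads X with an 8-bit immediate):

```asm
ExampleCaller:				;>Hypothetical label, not part of the routine above.
	JSL IndexLevelDimension		;>X = setting number * 2, or $FE if no height matched.
	CPX #$FE
	BEQ .InvalidDimension		;>Failsafe: unknown level height.
	TXA
	LSR				;>A = setting number ($00-$1C) as listed in LM's drop-down.
	STA $00				;>Scratch RAM, chosen only for illustration.
	RTL

.InvalidDimension
	RTL				;>Handle the invalid case here as needed.
```

Since the index comes back already doubled, it can also be used directly with ,x on any other 16-bit table that has one entry per setting.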

Edit: Now added the two new routines:


Reverse version:

Other routines required:


Currently, this ONLY works with LM3+ and does not support vertical levels (I may update it to support vertical levels).

4/24/2019 EDIT: Optimized the Index -> coords code; it no longer needs extra RAM.
A routine I made that makes writing tiles from a table to the status bar a lot easier (Format2 means the routine is designed for the SSB and OWB+ data format):

And the example use (super status bar):
Code
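	LDY.b #(TestTable_end-TestTable)-1	;>Y = table size in bytes, minus 1 (index of the last tile)
	LDA.b #TestTable			;\$00-$02 = 24-bit address of the tile table
	STA $00					;|
	LDA.b #TestTable>>8			;|
	STA $01					;|
	LDA.b #TestTable>>16			;|
	STA $02					;/
	LDA.b #!StatusBarPos			;\$03-$05 = 24-bit address of the status bar position
	STA $03					;|
	LDA.b #!StatusBarPos>>8			;|
	STA $04					;|
	LDA.b #!StatusBarPos>>16		;|
	STA $05					;/
	JSL SBWriteCode_TransferTableToStatusBarFormat2
	RTL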
TestTable:
	db $1d ;form the text "test"
	db $0e
	db $1c
	db $1d
.end


You can also write $03 to be the tile properties if you wanted to modify that as well.
Bit flag table handling:
Code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;Bit search. This is basically euclidean division
;by 8 to determine what bit and byte to read and
;write. Very useful if you have a table where each
;item is 1 bit large instead of a byte.
;
;Input:
; -A (8-bit) = what bit number (a flag), 0-255 ($00-$FF)
;Output:
; -X (8-bit) = What byte in byte-array to check from.
;  Up to X=31 ($1F) due to floor(255/8).
; -Y (8-bit) = what bit number in each byte: 0-7.
;
;To set a bit:
; LDA <Bitnumber>
; JSL BitToByteIndex
; LDA BitSelectTable,y
; ORA !RAMTable,x
; STA !RAMTable,x
; [...]
; BitSelectTable:	;Applies to all example codes presented here.
;  db %00000001 ;>Bit 0
;  db %00000010 ;>Bit 1
;  db %00000100 ;>Bit 2
;  db %00001000 ;>Bit 3
;  db %00010000 ;>Bit 4
;  db %00100000 ;>Bit 5
;  db %01000000 ;>Bit 6
;  db %10000000 ;>Bit 7
;
;To clear a bit:
; LDA <Bitnumber>
; JSL BitToByteIndex
; LDA BitSelectTable,y
; EOR.b #%11111111
; AND !RAMTable,x
; STA !RAMTable,x
;
;To read/check a bit:
; LDA <Bitnumber>
; JSL BitToByteIndex
; LDA !RAMTable,x
; AND BitSelectTable,y			;>Clear all bits except the bit we select.
; BEQ BitInTableClear			;\Conditions based on bit in table set or clear.
; BNE BitInTableSet			;/
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
BitToByteIndex:
	PHA			;>A as input preserved.
	AND.b #%00000111	;>WhatBit = Bitnumber MOD 8
	TAY			;>Place in Y.
	PLA			;>Restore what was originally in the input.
	LSR #3			;>ByteNumber = floor(Bitnumber/8)
	TAX			;>Place in X.
	RTL


(surprisingly simple)

Very useful in hacks that use bit-based (as opposed to “byte-based”) flags, such as the “pickup flags” many games use so that items you have obtained do not respawn when you re-enter the room.
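To make the division by 8 concrete, here is a small worked sketch that toggles flag number $2B. It follows the same pattern as the set/clear examples in the comment block; the toggle itself is an addition for illustration, ToggleFlag2B is a hypothetical label, and the !RAMTable address is only a placeholder that should be replaced with free RAM of your choice. It assumes 8-bit A and X/Y, and that the data bank matches the bank holding BitSelectTable:

```asm
!RAMTable = $7FA000			;>Placeholder address (assumption); use your own free RAM.

ToggleFlag2B:				;>Hypothetical label.
	LDA #$2B			;>Flag number 43 = %00101011
	JSL BitToByteIndex		;>X = $05 (floor(43/8)), Y = $03 (43 MOD 8)
	LDA BitSelectTable,y		;>A = %00001000 (bit 3)
	EOR !RAMTable,x			;>Flip bit 3 of byte 5, leave the other bits alone.
	STA !RAMTable,x
	RTL

BitSelectTable:
	db %00000001,%00000010,%00000100,%00001000
	db %00010000,%00100000,%01000000,%10000000
```

With 256 possible flags the table only needs 32 bytes of RAM, compared with 256 bytes if each flag took a whole byte.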
Originally posted by MarioE
Using an inlined reciprocal square root routine, I've created a pretty accurate aiming routine. Link

In order to use it, set $00 to be the 16-bit value (shooter_x - target_x), set $02 to be the 16-bit value (shooter_y - target_y), and set A to be the 8-bit projectile speed. It will return the projectile's X speed in $00 and the projectile's Y speed in $02. Note that distances over $0100 pixels are not allowed.

Explanation: Suppose one fired the projectile with X speed dx, and Y speed dy. Then its speed would be sqrt(dx^2+dy^2). Thus, we can adjust its speed by multiplying by speed / sqrt(dx^2+dy^2). This routine calculates the reciprocal 1 / sqrt(dx^2+dy^2), multiplies by speed, then multiplies by either dx or dy.

erik edit: fixed link because dropbox
added the sa-1 compatible variant (requires a !SA1 detection if you're using spritetool)



Let me explain this further. It uses the distance formula from the Pythagorean theorem. We have to solve for the X and Y speeds (Speed_X and Speed_Y), since the game processes physics per axis. We are given the intended diagonal speed (Speed_hypotenuse) and the leg distances (DeltaX_Leg and DeltaY_Leg). After getting the length sqrt(dx^2+dy^2), multiplying by its reciprocal amounts to this: Speed_X or Speed_Y = (Speed_hypotenuse * [DeltaX_Leg or DeltaY_Leg]) / sqrt(dx^2+dy^2). This acts like a unit converter that turns the distance vector into a speed vector, which is just scaling. I think the implied idea is that it takes the ratio hypotenuse/IntendedSpeed and solves for the X and Y speeds from their leg distances by cross-multiplying.

Edit: the reciprocal is there to normalize the vector, so that we can then scale it into the X and Y speeds.
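The scaling described above can be written out with a worked example. This is only a sketch of the math: the symbols s, v_x and v_y are introduced here for the projectile speed and the resulting axis speeds, the sample numbers are made up, and rounding and sign handling inside the actual routine are ignored:

```latex
\[
v_x = \frac{s \cdot dx}{\sqrt{dx^2 + dy^2}}, \qquad
v_y = \frac{s \cdot dy}{\sqrt{dx^2 + dy^2}}
\]
\[
dx = 48,\; dy = 64,\; s = 32:\quad
\sqrt{48^2 + 64^2} = \sqrt{6400} = 80
\]
\[
v_x = \frac{32 \cdot 48}{80} = 19.2, \qquad
v_y = \frac{32 \cdot 64}{80} = 25.6, \qquad
\sqrt{19.2^2 + 25.6^2} = \sqrt{1024} = 32 = s
\]
```

The last check shows why normalizing first works: the resulting speed vector points along (dx, dy) and its length equals the requested speed.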