Diff between 2.4.5pre5aa1 and 2.4.5pre5aa2:

Only in 2.4.5pre5aa1: 00_alpha-tsunami-1G-dynamic-pci-1
Only in 2.4.5pre5aa2: 00_alpha-tsunami-1G-dynamic-pci-2

	Fixes a silly bug for the <= 2G tsunamis, nothing changes for the
	>2G boxes or non tsunami machines.

	(recommended update)

The new tsunami-iommu-align-2 is here (it needed a rediff due to rejects):

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.5pre5aa2/tsunami-iommu-align-2

---------------------------------------------------------------------------

The main news in 2.4.5pre5aa1 is the alpha updates, in particular the iommu
fixes that are supposed to stabilize the big alpha boxes in production. The most
important bit may be still missing from this diff because I want to force
people to test a possible alternate "real" fix for the iommu instability.
Maybe the only bug that had to be fixed is the one addressed by
00_alpha-iommu-tlb-corruption-1 that I spotted and fixed today.

So in short if 2.4.5pre5aa1 still crashes with tons of ram on the tsunami
chipsets, then you can apply this incremental patch on top of 2.4.5pre5aa1 and
it should stabilize again:

	ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.5pre5aa1/tsunami-iommu-align-1

I just want to make 100% sure that the above patch, which introduces the
artificial alignment to keep invalid (but not corrupted) ptes from being cached
in the tlb, is necessary and that the real bug wasn't the corrupted ptes
generated during pci_map_* and fixed by 00_alpha-iommu-tlb-corruption-1.

---------------------------------------------------------------------------

Detailed description on the diff between 2.4.5pre2aa1 and 2.4.5pre5aa1 follows.

---------------------------------------------------------------------------

Only in 2.4.5pre5aa1: 00_alpha-cia-1

	Fix for CIA by Ivan plus a few additional fixes.

	(recommended)

Only in 2.4.5pre5aa1: 00_alpha-cypress-quirk-1

	Never use pci32 addresses above -1M to avoid confusing the cypress chip.

	(recommended)

Only in 2.4.5pre5aa1: 00_alpha-dp264-compile-1

	Allows non generic dp264 compiles to link.

	(nice to have)

Only in 2.4.5pre5aa1: 00_alpha-iommu-locking-1

	Fixes races in the pci_unmap_* paths of alpha.

	(recommended)

Only in 2.4.5pre5aa1: 00_alpha-iommu-printk-1

	Convert to warnings the printks notifying about pci_map_* failures (they
	should be KERN_CRIT but think positive and assume we'll fix the bugs
	eventually ;).

	(nice to have)

Only in 2.4.5pre5aa1: 00_alpha-iommu-tlb-corruption-1

	Avoid corrupting the tlb and possibly crashing the iommu with pagetable
	bits that must be zero.

	(recommended)

Only in 2.4.5pre2aa1: 00_alpha-numa-10
Only in 2.4.5pre5aa1: 00_alpha-numa-11

	Minor changes in the alpha-numa support, compilation issues.

	(update)

Only in 2.4.5pre5aa1: 00_alpha-sg-direct-1

	Fix for the patch from Richard that uses the direct window during SG
	mappings if we run out of pci dynamic mappings (by the time we run out
	of dynamic mappings we're going to break very soon anyways, but what
	this patch does is certainly a good idea).

	(recommended)

Only in 2.4.5pre5aa1: 00_alpha-spinlock-debug-tmo-1

	Increase the spinlock debugging timeout trigger for the faster cpus.
	(spinlock debugging is disabled anyways)

	(nice to have)

Only in 2.4.5pre5aa1: 00_alpha-tsunami-1G-dynamic-pci-1

	Increase the dynamic window for tsunami from 128M to 1G (make sure
	to apply also the 00_alpha-cypress-quirk-1 if you apply this by hand).
	This tries to better hide the bugs in the drivers that don't check
	for pci_map_* failures.

	(recommended)

Only in 2.4.5pre5aa1: 00_alpha-tsunami-no-wmb-1

	Drop an unnecessary wmb().

	(nice to have)

Only in 2.4.5pre5aa1: 00_alpha-tsunami-pci64-enable-1

	Enable monster window on the tsunami chip plus define
	TSUNAMI_DAC_OFFSET.  This is a backdoor to allow device drivers that
	are aware they are running on a tsunami chipset to use DAC on pci64.
	What the driver needs to do in order to convert a physical address to a
	pci64 bus address is to `logical or' the physical address with the
	TSUNAMI_DAC_OFFSET (that is 1<<40), and then program the chip to start
	the dma on the so calculated 64bit bus address.  This is a very
	primitive backdoor, implementing a virt_to_bus64 isn't that far and a
	virt_to_bus64 would be enough for alpha, plus in any case we'll also
	need a common code abstraction to allow the driver to know if it can
	use DAC or not, which is missing as well at the moment.

	It is recommended to use this backdoor only to decrease the pressure
	on the iommu mappings, so that we decrease the probability of
	triggering all the device driver bugs.

	(nice to have for most, recommended for anybody running out of entries
	with a pci64 device on a tsunami chip)
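
	A minimal sketch of the address computation described above, in plain
	userspace C. Only TSUNAMI_DAC_OFFSET and its value (1<<40) come from
	the patch description; the helper name tsunami_phys_to_dac() is
	hypothetical, and so is the assumption that physical addresses stay
	below bit 40.

```c
#include <assert.h>
#include <stdint.h>

/* Value given in the patch description: bit 40 selects the monster window. */
#define TSUNAMI_DAC_OFFSET (UINT64_C(1) << 40)

/*
 * Hypothetical helper (not part of the patch): convert a physical address
 * to a pci64 bus address by OR-ing in the DAC offset.
 */
static uint64_t tsunami_phys_to_dac(uint64_t phys)
{
	/* Assumption: the physical address never has bit 40 set already. */
	assert((phys & TSUNAMI_DAC_OFFSET) == 0);
	return phys | TSUNAMI_DAC_OFFSET;
}
```

	A driver would then program the resulting 64bit value into the device
	as the dma start address, instead of going through pci_map_*.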

Only in 2.4.5pre5aa1: 00_boot-serial-console-1

	Allows the serial console to work anytime during boot. It may have side
	effects but certainly nothing relevant and the current situation was
	annoying enough.

	(nice to have)

Only in 2.4.5pre5aa1: 00_eepro100-64bit-1

	Fixes a 64bit bug that was generating false positives and memory
	corruption.

	(recommended)

Only in 2.4.5pre2aa1: 00_nfs-corruption-3

	Dropped the flush at unmap thing, replaced with the flush at release
	from mainline. NOTE: the 00_nfs-corruption-3 was not buggy for
	production, this is just an alternate better way of fixing the
	corruption.

Only in 2.4.5pre2aa1: 00_rwsem-11
Only in 2.4.5pre5aa1: 00_rwsem-13

	Updated to compile. I had to disable the ppc asm optimizations as I
	didn't have time to look into them. Unless somebody sends me a patch I
	will look into it shortly (once I return to optimization mode, now I am
	in stability mode ;).

	(update)

Only in 2.4.5pre5aa1: 00_xircom-serial-1

	Merged a fix posted on l-k that makes the xircom integrated modem work
	fine (without this artificial delay during initialization it really
	doesn't work).

	(nice to have)

Only in 2.4.5pre2aa1: 00_xircom-tulip-cb-arjanv-1.bz2
Only in 2.4.5pre5aa1: 00_xircom-tulip-cb-arjanv-2.bz2

	Update with the latest driver at http://people.redhat.com/arjanv

---------------------------------------------------------------------------

Detailed description of 2.4.5pre2aa1 follows.

---------------------------------------------------------------------------

00_alpha-illegal-irq-1

	Be verbose for MAX_ILLEGAL_IRQS times if an invalid irq number
	is getting run.

	(debugging)

00_alpha-ksyms-1

	Export a few alpha-arch symbols needed by modules.

	(recommended to avoid compilation troubles)

00_alpha-large-vmalloc-1

	Drop the CONFIG_LARGE_VMALLOC selection from the
	arch/alpha/config.in, the large-vmalloc feature is racy and it can
	destabilize the machine, fixing it isn't worthwhile because
	nobody needs more than 8Gigabytes of ram in vmalloc memory,
	not even tux on the 256G boxes will ever need that.

	(recommended)

00_alpha-modrace-1

	Fix alpha races between module insmod/rmmod and the page fault
	fixmap lookup.

	(recommended)

00_alpha-numa-10

	Fully supports wildfire machines with all kinds of NUMA memory
	configurations, plus it optimizes the allocation on a per node
	basis to boost the performance on the NUMA boxes. Right now
	CONFIG_WILDFIRE needs to be selected to take advantage of this
	feature (CONFIG_GENERIC + CONFIG_DISCONTIGMEM=y and
	CONFIG_NUMA=y will work fine as well but it won't take
	advantage of the new feature). It also fixes many memory
	management bits in the core linux allocator in the common code,
	mostly to avoid wasting static memory.

	(recommended)

00_alpha-sched-yield-1

	Fixes SCHED_YIELD on the alpha arch for UP compiles.

	(recommended)

00_alpha-show_stack-1

	Implements the show_stack() call used often by some common
	code, mostly to allow compilation, things like tux need it.

	(nice to have)

00_alpha-tlb-page-sym-1

	Drops an unnecessary export on the alpha port.

	(recommended)

00_buffer-2

	Reschedule during oom while allocating buffers, still getblk
	can deadlock with oom but this will hide it pretty well as
	it won't loop in a tight loop anymore.

	(recommended)

00_cachelinealigned-in-smp-1

	Moves the pagecache_lock and the VM pagemap_lru_lock into two
	different L1 cachelines to avoid contention, mostly useful on
	the alpha where the spinlocks use load locked store
	conditional loops (and we don't want to loop).

	(nice to have)
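
	A rough userspace illustration of the idea, not the kernel code: each
	lock is forced onto its own cacheline so that taking one doesn't
	bounce the line holding the other. The 64 byte line size, the
	spinlock_sketch_t type and the struct name are assumptions made for
	the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define L1_CACHE_BYTES_SKETCH 64	/* assumption: 64 byte L1 cachelines */

/* Stand-in for the kernel spinlock type. */
typedef struct { volatile unsigned int lock; } spinlock_sketch_t;

/* Each lock starts on its own cacheline boundary. */
struct vm_locks_sketch {
	_Alignas(L1_CACHE_BYTES_SKETCH) spinlock_sketch_t pagecache_lock;
	_Alignas(L1_CACHE_BYTES_SKETCH) spinlock_sketch_t pagemap_lru_lock;
};

/* The padding costs memory: the struct now spans two full cachelines. */
static_assert(sizeof(struct vm_locks_sketch) == 2 * L1_CACHE_BYTES_SKETCH,
	      "one cacheline per lock");

/* Returns 1 if both locks fall into the same cacheline, 0 otherwise. */
static int locks_share_cacheline(void)
{
	size_t a = offsetof(struct vm_locks_sketch, pagecache_lock);
	size_t b = offsetof(struct vm_locks_sketch, pagemap_lru_lock);

	return a / L1_CACHE_BYTES_SKETCH == b / L1_CACHE_BYTES_SKETCH;
}
```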

00_copy-user-lat-2

	Puts reschedule points into the copy-user calls, with lots of
	cache large reads/writes could otherwise _never_ reschedule
	until they return to userspace.

	(recommended)

00_cpus_allowed-1

	Fixes a bug in the cpu affinity in-kernel API, bug was fatal
	for ksoftirqd.

	(recommended)

00_double-buffer-pass-1

	Avoids looping two times for no good reason over the lru lists
	of the buffer cache (the double loop was an unreliable hack
	from the prehistory that survived 'till today).

	(nice to have)

00_exception-table-1

	Avoids a compilation warning when compiling without modules.

	(very minor thing)

00_highmem-deadlock-3

	Fixes a highmem deadlock using a reserved pool for the bounce
	buffers.

	(recommended)
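
	The general shape of a reserved pool, sketched in userspace C. This is
	not the patch itself: the pool size, the function names and the
	normal_alloc_fails knob (which simulates memory pressure) are all
	made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_SIZE_SKETCH 4	/* hypothetical number of reserved buffers */
#define BOUNCE_BYTES_SKETCH 4096	/* hypothetical bounce buffer size */

static void *pool[POOL_SIZE_SKETCH];
static int pool_count;

/* Fill the reserve up front, while memory is still plentiful. */
static int bounce_pool_init(void)
{
	while (pool_count < POOL_SIZE_SKETCH) {
		void *p = malloc(BOUNCE_BYTES_SKETCH);

		if (!p)
			return -1;
		pool[pool_count++] = p;
	}
	return 0;
}

/* Try the normal allocator first, fall back to the reserve under pressure. */
static void *bounce_alloc(int normal_alloc_fails)
{
	void *p = normal_alloc_fails ? NULL : malloc(BOUNCE_BYTES_SKETCH);

	if (p)
		return p;
	if (pool_count > 0)
		return pool[--pool_count];
	return NULL;	/* caller has to wait for in-flight I/O to refill the pool */
}

/* Refill the reserve before giving memory back to the system. */
static void bounce_free(void *p)
{
	if (pool_count < POOL_SIZE_SKETCH)
		pool[pool_count++] = p;
	else
		free(p);
}
```

	The reserve guarantees forward progress: the I/O that would free
	memory can always get a bounce buffer, and completing that I/O puts
	the buffer back.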

00_highmem-debug-1

	Allows people with x86 machines with less than 1G of ram to
	test the highmem code.

	(debugging)

00_ia32-bootmem-corruption-1

	Fixes the x86 boot stage to finish initializing all the
	reserved memory before starting allocating memory.

	(recommended)

00_ipv6-null-oops-1

	Fixes null pointer oops.

	(recommended)

00_jens-loop-noop-nobounce-1

	Skips the bounces with the null transfer function.

	(nice to have)

00_ksoftirqd-4

	Avoids 1/HZ latency for the softirq if the softirq is marked
	again pending when do_softirq() finished and the machine is
	otherwise idle, it also fixes the case of a softirq re-marking
	itself runnable by delegating to the scheduler the balance of
	the softirq load as if it were a normal task.

	(nice to have)

00_kupdate-large-interval-1

	Allows setting a large interval for the kupdate runs, this is
	useful on laptops: instead of sigstopping ksoftirqd it's
	nicer to set a large interval, for example of the order of one
	hour (do that at your own risk of course, doing that is not
	recommended unless you know what you're doing).

	(nice to have)

00_lvm-0.9.1_beta7-4

	Updates to the lvm beta7 with fixes for the lv hardsectsize
	estimation based on the max hardsectsize of the underlying
	pv, plus it has tons of other fixes and it is a must have
	for the 64bit archs as the IOP silently changed for those
	platforms.

	(recommended)

00_max_readahead-1

	Increases the max_readahead to allow the blkdev to read with
	512k scsi commands when possible.

	(nice to have)

00_msync-fb0-1

	Fixes oopses while running msync on a region of virtual memory
	that maps to reserved memory.

	(recommended)

00_nfs-corruption-3

	Production fix for the nfs map_shared fs data
	corruption. Other design solutions are discussed and the long
	term fix might be different but all the other approaches are
	more invasive and risky, while this one is obviously right and
	the most appropriate for production.

	(recommended)

00_numa-sched-6

	Implements a basic numa scheduler with a per node runqueue
	that boosts the performance on numa boxes. It also enables
	and in places fixes the last_idle heuristic in the smp
	scheduler.

	The numa part doesn't impose any runtime overhead when
	CONFIG_NUMA_SCHED is not set.

	(nice to have)

00_o_direct-6

	Implements O_DIRECT zerocopy direct I/O in a filesystem
	independent fashion, currently it's only available on ext2,
	but it will be easy to let the other filesystems take advantage
	of it too.

	(nice to have)

00_pagetable-fast-2

	Enables the usage of the per-cpu quicklists for the pte/pgd to
	optimize the cache affinity and footprint.

	(nice to have)

00_peekurgdata-1

	Allows MSG_PEEK to work on the urgent (aka out of band)
	receive queue as well, needed for tux.

	(nice to have)

00_rwsem-11

	Alternate implementation for the rwsem, the x86 asm version is
	simpler in the slow path and it provides a faster up_write
	fast path, the C-spinlock version is much faster too.

	(nice to have)

00_sched-yield-1

	Fixes a bug in sched_yield, where SCHED_YIELD needs to be
	cleared after a schedule() if no other task was runnable,
	otherwise the next schedule() would inherit the SCHED_YIELD
	behaviour even if a SCHED_YIELD wasn't requested anymore.

	(nice to have)
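
	A toy model of the bug and the fix, assuming a much simplified
	schedule(). The flag value, the struct and the function names are
	stand-ins for the sketch, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_YIELD_SKETCH 0x10	/* stand-in flag bit for the sketch */

struct task_sketch {
	unsigned long policy;	/* scheduling policy plus the yield flag */
};

/*
 * Toy schedule(): returns 1 if prev keeps the cpu, 0 if it is switched out.
 * Simplification: a task that is not yielding always keeps the cpu here.
 */
static int schedule_sketch(struct task_sketch *prev, int others_runnable)
{
	int yielding, keeps_cpu;

	assert(prev != NULL);
	yielding = (prev->policy & SCHED_YIELD_SKETCH) != 0;
	keeps_cpu = !(yielding && others_runnable);

	/*
	 * The point of the fix: drop the yield request on every pass, even
	 * when nobody else was runnable, so it can't leak into the next call.
	 */
	prev->policy &= ~(unsigned long)SCHED_YIELD_SKETCH;
	return keeps_cpu;
}
```

	Without the unconditional clear, a yield issued while the task was
	alone on the runqueue would still be pending at the next schedule()
	and would push the task aside as soon as another one shows up.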

00_show_regs-1

	Makes SYSRQ+P more verbose.

	(debugging)

00_slab-lists-1

	Rewrites the slab cache handling for partial and completely
	free slab objects, always provides LIFO behaviour from all
	the lists to reduce the cache misses, while previously the
	completely free slab objects were reallocated with a less
	optimal FIFO policy. It also cleans up the code.

	(nice to have)
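
	To show why LIFO helps, here is a minimal free list in userspace C.
	It only illustrates the ordering policy, it is not the slab code, and
	all the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A free object, linked through its own storage. */
struct obj_sketch {
	struct obj_sketch *next;
};

static struct obj_sketch *free_head;

/* Freed objects go to the head of the list... */
static void lifo_free(struct obj_sketch *o)
{
	assert(o != NULL);
	o->next = free_head;
	free_head = o;
}

/* ...and allocations come from the head too: last freed, first reused. */
static struct obj_sketch *lifo_alloc(void)
{
	struct obj_sketch *o = free_head;

	if (o)
		free_head = o->next;
	return o;
}
```

	The object freed most recently is the one most likely to still sit in
	the cpu cache, so handing it out first saves cache misses. A FIFO
	list would hand out the coldest object instead.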

00_softirq-SMP-fixes-3

	Fixes an SMP race in the softirq code on archs like the alpha
	where the atomic_t and bitop operations aren't memory barriers
	as well.

	(recommended)
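
	The general pattern, sketched with C11 atomics in userspace rather
	than with the kernel primitives: when the bit operation itself orders
	nothing (modelled here with memory_order_relaxed), the barriers have
	to be written out explicitly. The function and variable names are
	invented for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>

static int softirq_data;		/* payload handed to the handler */
static atomic_uint pending_mask;	/* stand-in for the pending bits */

/* Producer: publish the payload, then mark softirq nr pending. */
static void raise_sketch(unsigned int nr, int data)
{
	assert(nr < 32);
	softirq_data = data;
	/* Explicit barrier: the relaxed bit operation below orders nothing. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_or_explicit(&pending_mask, 1u << nr, memory_order_relaxed);
}

/* Consumer: returns 1 and stores the payload if softirq nr was pending. */
static int run_sketch(unsigned int nr, int *out)
{
	unsigned int bit = 1u << nr;
	unsigned int old = atomic_fetch_and_explicit(&pending_mask, ~bit,
						     memory_order_relaxed);

	if (!(old & bit))
		return 0;
	/* Pairs with the release fence in raise_sketch(). */
	atomic_thread_fence(memory_order_acquire);
	*out = softirq_data;
	return 1;
}
```

	On archs where the atomic and bit operations already imply a barrier
	the explicit fences are redundant, which is why this class of bug
	only shows up on the others.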

00_sync-page-1

	Avoids spurious unplugs of the tq_disk task queue while waiting
	for I/O completion.

	(nice to have)

00_timer_t-2

	Defines the timer_t type in one single place.

	(nice to have)

00_waitqueue-2

	Sets up the not yet visible waitqueue flags outside the
	critical section.

	(nice to have)

00_x86-systable-1

	Fills the end of the syscall table automatically, it is off by
	one without this patch.

	(nice to have)
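
	The real table is built in assembly, so the following is only a C
	analogue of the idea: derive the number of filler entries from the
	table size instead of hardcoding a count that can go stale by one.
	The table size, the type and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SYSCALLS_SKETCH 8	/* hypothetical table size */

typedef long (*syscall_fn_t)(void);

/* Filler entry for the slots that have no syscall behind them. */
static long sys_ni_syscall_sketch(void)
{
	return -ENOSYS;
}

/*
 * Fill every slot from the first unimplemented one up to and including
 * the last slot of the table.
 */
static void fill_table_tail(syscall_fn_t *table, size_t implemented,
			    size_t total)
{
	size_t i;

	assert(implemented <= total);
	for (i = implemented; i < total; i++)
		table[i] = sys_ni_syscall_sketch;
}
```

	With a hand maintained count, adding one syscall without bumping the
	count leaves the last slot unfilled. Computing the range from the
	table size makes that mistake impossible.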

00_xircom-tulip-cb-arjanv-1

	No need of ifconfig eth0 promisc on the xircom card with this
	driver.

	(nice to have)

10_no-virtual-1

	Avoids wasting tons of memory if highmem is not selected (like
	in all the 64bit ports).

	(nice to have)

10_parent-timeslice-7

	Fixes a scheduler unfairness generated by the parent-timeslice
	logic.

	(recommended)

10_read_ahead-2

	Sets up readahead for lvm [global, hacking around read/write
	callbacks is broken] and drops senseless limits.

	(recommended)

20_share-timeslice-2

	Reschedule the child first but share the timeslice with the
	parent (I never triggered a single userspace bug with this
	correct patch)

	(nice to have)

The patches below aren't included by default but they can be applied incrementally
if tux is needed. The TUX patches are being developed by Ingo Molnar.

30_atomic-alloc-1

	Defines a PF_ATOMICALLOC that won't sleep waiting for memory to
	become available.

	(needed by tux)

30_atomic-lookup-1

	Implements an O_ATOMICLOOKUP flag to avoid entering the
	filesystem code that could wait for I/O in open, open will
	return -EWOULDBLOCKIO if the lookup wasn't doable in dcache.

	(needed by tux for the asynchronous vfs lookups)

30_net-exports-1

	Exports some networking functions so that tux can be compiled
	as a module, and splits the sock_map_file functionality away
	from sock_map_fd.

	(needed by tux)

30_pagecache-atomic-1

	Defines a do_generic_file_read_atomic that implements
	nonblocking reads from cache, if information wasn't in cache
	the descriptor error flag is set to -EWOULDBLOCKIO.

	(needed by tux to implement asynchronous read I/O)

30_tux-1

	tux core.

	(nice to have)

30_tux-data-1

	Adds a tux_data private pointer to the sock structure.

	(needed by tux)

30_tux-dprintk-1

	Defines tux_Dprintk.

	(needed by tux)

30_tux-exports-1

	Exports to modules the in-kernel syscalls.

	(needed by tux)

30_tux-kstat-1

	Defines additional kstats for tux.

30_tux-process-1

	Defines per-process callbacks for tux.

30_tux-syscall-1

	Implements the tux-syscall, it's the one executed by the
	userspace tux program to fire up, shutdown and control the tux
	kernel threads.

	(needed by tux)

30_tux-sysctl-1

	Defines the symbolic names for the tux sysctl.

	(needed by tux)

30_tux-vfs-1

	Adds a per-dentry tux private data.

	(needed by tux)

31_tux-logger-1

	Aligns correctly the tux logentry for 8 byte cachelines too.

---------------------------------------------------------------------------

Have fun!

Andrea