From b1bfbda896a9d9d8e8bd86dd08aac2b2f9928ce1 Mon Sep 17 00:00:00 2001
From: Ben Skeggs <bskeggs@redhat.com>
Date: Tue, 1 Jun 2010 15:32:24 +1000
Subject: [PATCH] drm-nouveau-updates
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

drm/nouveau: use drm_mm in preference to custom code doing the same thing

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove left-over !DRIVER_MODESET paths

It's far preferable to have the driver do nothing at all for "nomodeset".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: missed some braces

Luckily this had absolutely no effect whatsoever :)

Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: move LVDS detection back to connector detect() time

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Put the dithering check back in nouveau_connector_create.

a7b9f9e5adef dropped it by accident.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Thibaut Girka <thib@sitedethib.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Don't clear AGPCMD completely on INIT_RESET.

We just need to clear the SBA and ENABLE bits to reset the AGP
controller: If the AGP bridge was configured to use "fast writes",
clearing the FW bit would break the subsequent MMIO writes and
eventually end with a lockup.

Note that all the BIOSes I've seen do the same as we did (it works for
them because they don't use MMIO). OTOH, the blob leaves FW untouched.
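
A minimal sketch of the masking described above, not the actual parser
code. The bit values are assumed to match PCI_AGP_COMMAND_* in
linux/pci_regs.h, and agpcmd_reset_value() is a hypothetical helper
name used only for illustration:

```c
#include <stdint.h>

/* Assumed AGP command register bits (same values as PCI_AGP_COMMAND_*). */
#define AGPCMD_FW      0x00000010u  /* fast writes enable */
#define AGPCMD_ENABLE  0x00000100u  /* AGP enable */
#define AGPCMD_SBA     0x00000200u  /* sideband addressing enable */

/*
 * Hypothetical helper: compute the AGPCMD value to write back on
 * INIT_RESET. Only SBA and ENABLE are cleared; FW and every other
 * bit are left exactly as the bridge configured them.
 */
static uint32_t agpcmd_reset_value(uint32_t agpcmd)
{
	return agpcmd & ~(AGPCMD_SBA | AGPCMD_ENABLE);
}
```

With FW, SBA and ENABLE all set on entry, only FW (plus any rate bits)
survives the reset.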

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Ignore broken legacy I2C entries.

The nv05 card in the bug report [1] doesn't have usable I2C port
register offsets (they're all filled with zeros). Ignore them and use
the defaults.

[1] http://bugs.launchpad.net/bugs/569505

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: set encoder for lvds

Fixes an oops in nouveau_connector_get_modes() when nv_encoder is NULL.

Signed-off-by: Albert Damen <albrt@gmx.net>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: tidy connector/encoder creation a little

Create connectors before encoders to avoid having to do another loop across
encoder list whenever we create a new connector.  This allows us to pass
the connector to the encoder creation functions, and avoid using a
create_resources() callback since we can now call it directly.

This can also potentially modify the connector ordering on nv50.  On cards
where the DCB connector and encoder tables are in the same order, things
will be unchanged.  However, there are some cards where the ordering of
the tables differs, which in one case leads us to naming the connectors
"wrongly".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: downgrade severity of most init table parser errors

As long as we know the length of the opcode, we're probably better off
trying to parse the remainder of an init table rather than aborting in
the middle of it.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix DP->DVI if output has been programmed for native DP previously

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: DCB quirk for Dell M6300

Uncertain if this is a weirdo configuration, or a BIOS bug.  If it's not
a BIOS bug, we still don't know how to make it work anyway so ignore a
"conflicting" DCB entry to prevent a display hang.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm: disable encoder rather than dpms off in drm_crtc_prepare_encoders()

Original behaviour will be preserved for drivers that don't implement
disable() hooks for an encoder.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: supply encoder disable() hook for SOR outputs

Allows us to remove a driver hack that used to be necessary to disable
encoders in certain situations before setting up a mode.  The DRM has
better knowledge of when this is needed than the driver does.

This fixes a number of display switching issues.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix regression caused by ed15e77b6ee7c4fa6f50c18b3325e7f96ed3aade

It became possible for us to have connectors present without any encoders
attached (TV out, we don't support TVDAC yet), which caused the DDX to
segfault.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv04: fix regression caused by ed15e77b6ee7c4fa6f50c18b3325e7f96ed3aade

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: when debugging on, log which crtc we connect an encoder to

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv17-nv40: Avoid using active CRTCs for load detection.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv04-nv40: Prevent invalid DAC/TVDAC combinations.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv04-nv40: Disable connector polling when there're no spare CRTCs left.

Load detection needs the connector wired to a CRTC. When there are no
inactive CRTCs left, that means we need to cut some other head off for
a while, causing intermittent flickering.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix memory detection for cards with >=4GiB VRAM

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Fix a couple of sparse warnings.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: INIT_CONFIGURE_PREINIT/CLK/MEM on newer BIOSes is not an error.

No need to spam the logs when they're found, they're equivalent to
INIT_DONE.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv04-nv40: Drop redundant logging.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Move the fence wait before migration resource clean-up.

Avoids an oops in the fence wait failure path (bug 26521).

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Workaround broken TV load detection on a "Zotac FX5200".

The blob seems to have the same problem so it's probably a hardware
issue (bug 28810).

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: send evo "update" command after each disconnect

It turns out that the display engine signals an interrupt for disconnects
too.  In order to make it easier to process the display interrupts
correctly, we want to ensure we only get one operation per interrupt
sequence - this is what this commit achieves.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: rewrite display irq handler

The previous handler basically worked correctly for a full-blown mode
change.  However, it did nothing at all when a partial (encoder only)
reconfiguration was necessary, leading to the display hanging on certain
types of mode switch.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: move DP script invocation to nouveau_dp.c

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: set DP display power state during DPMS

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: add scaler-only modes for eDP too

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove dev_priv->init_state and friends

Nouveau will no longer load at all if card initialisation fails, so all
these checks are unnecessary.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: implement DAC disconnect fix missed in earlier commit

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: reduce usage of fence spinlock to when absolutely necessary

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: place notifiers in system memory by default

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: add instmem flush() hook

This removes the previous prepare_access() and finish_access() hooks, and
replaces them with a much simpler flush() hook.

All the chipset-specific code before nv50 has its use removed completely,
as it's not required there at all.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: move tlb flushing to a helper function

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove ability to use external firmware

This was always really a developer option, and if it's really necessary we
can hack this in ourselves.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: allocate fixed amount of PRAMIN per channel on all chipsets

Previously only done on nv50+

This commit also switches unknown NV2x/NV3x chipsets to noaccel mode.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove unused fbdev_info

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: cleanup nv50_fifo.c

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv20-nv30: move context table object out of dev_priv

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix dp_set_tmds to work on the right OR

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix mtrr cleanup path

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: move dp_set_tmds() function to happen in the last display irq

It seems on some chipsets that doing this from the 0x20 handler causes the
display engine to not ever signal the final 0x40 stage.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: initialise display before enabling interrupts

In some situations it's possible we can receive a spurious hotplug IRQ
before we're ready to handle it, leading to an oops.

Calling the display init before enabling interrupts should clear any
pending IRQs on the GPU and prevent this from happening.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Disable PROM access on init.

On older cards (<nv17) scanout gets blocked when the ROM is being
accessed. PROM access usually comes out enabled from suspend, so switch
it off.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/i2c/ch7006: Fix up suspend/resume.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv04: Enable context switching on PFIFO init.

Fixes a lockup when coming back from suspend.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: fix RAMHT size

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: use correct PRAMIN flush register on original nv50

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove quirk to fabricate DVI-A output on DCB 1.5 boards

There's a report of this quirk breaking modesetting on at least one board.
After discussion with Francisco Jerez, we've decided to remove it:

<darktama> it's not worth limiting the quirk to just where we know it can
           work?  i'm happy either way really :)
<curro> hmm, don't think so, most if not all DCB15 cards have just one DAC
<curro> and with that quirk there's no way to tell if the load comes from
        the VGA or DVI port

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: support fetching LVDS EDID from ACPI

Based on a patch from Matthew Garrett.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Matthew Garrett <mjg@redhat.com>

drm/nv50: fix regression that breaks LVDS in some places

A previous commit started additionally using the SOR link when trying to
match the correct output script.  However, we never fill in this field
for LVDS so we can never match a script at all.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Fix a sparse warning.

It doesn't like variable length arrays.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Don't pick an interlaced mode as the panel native mode.

Rescaling interlaced modes isn't going to work correctly, and even if
it did, come on, interlaced flat panels? are you pulling my leg?

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Add another Zotac FX5200 TV-out quirk.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Add some PFB register defines.

Also collect all the PFB registers in a single place and remove some
duplicated definitions.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv04-nv3x: Implement init-compute-mem.

Init-compute-mem was the last piece missing for nv0x-nv3x card
cold-booting. This implementation is somewhat lacking but it's been
reported to work on most chipsets it was tested in. Let me know if it
breaks suspend to RAM for you.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Patrice Mandin <patmandin@gmail.com>
Tested-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Xavier Chantry <chantry.xavier@gmail.com>
Tested-by: Marcin Kościelnicki <koriakin@0x04.net>

drm/i2c/ch7006: Don't assume that the specified config points to static memory.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Add some generic I2C gadget detection code.

Clean up and move the external TV encoder detection code to
nouveau_i2c.c, it's also going to be useful for external TMDS and DDC
detection.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Remove useless CRTC_OWNER logging.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: No need to lock/unlock the VGA CRTC regs all the time.

Locking only makes sense in the VBIOS parsing code as it's executed
before CRTC init.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Reset CRTC owner to 0 before BIOS init.

Fixes suspend+multihead on some boards that also use BIOS scripts for
modesetting.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: fix build without CONFIG_ACPI

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: add nv_mask register accessor

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: add function to control GPIO IRQ reporting

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: disable hotplug detect around DP link training

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv30: Init the PFB+0x3xx memory timing regs.

Fixes the randomly flashing vertical lines seen on some nv3x after a
cold-boot.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Reset AGP before running the init scripts.

BIOS scripts usually make an attempt to reset the AGP controller,
however on some nv4x cards doing it properly involves switching FW off
and on: if we do that without updating the AGP bridge settings
accordingly (e.g. with the corresponding calls to agp_enable()) we
will be locking ourselves out of the card MMIO space. Do it from
nouveau_mem_reset_agp() before the init scripts are executed.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Put back the old 2-messages I2C slave test.

I was hoping we could detect I2C devices at a given address without
actually writing data into them, but apparently some DDC slaves get
confused by 0-byte transactions. Put the good old test back.

Reported-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Move display init to a new nouveau_engine.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Get rid of the remaining VGA CRTC locking.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix TV-out detection on unposted cards lacking a usable DCB table.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: correct wait condition for instmem flush

Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: introduce gpio engine

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix some not-error error messages

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: use custom i2c algo for dp auxch

This makes it easier to see how this is working, and lets us transfer the
EDID in blocks of 16 bytes.

The primary reason for this change is because debug logs are rather hard
to read with the hundreds of single-byte auxch transactions that occur.
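
A rough sketch of the chunking idea, not the driver's i2c algo. The
callback type, aux_read_chunked() and fake_xfer() are hypothetical
names, and the 16-byte limit is taken from the text above:

```c
#include <stddef.h>

#define AUX_MAX_CHUNK 16  /* bytes moved per auxch transaction */

/* Hypothetical transfer callback: read len bytes starting at offset. */
typedef int (*aux_xfer_fn)(void *ctx, size_t offset,
			   unsigned char *buf, size_t len);

/* Stand-in for real hardware: byte N of the "EDID" has the value N. */
static int fake_xfer(void *ctx, size_t offset, unsigned char *buf, size_t len)
{
	size_t i;

	(void)ctx;
	for (i = 0; i < len; i++)
		buf[i] = (unsigned char)(offset + i);
	return 0;
}

/*
 * Split one large read into transactions of at most AUX_MAX_CHUNK bytes.
 * Returns the number of transactions issued, or a negative error.
 */
static int aux_read_chunked(aux_xfer_fn xfer, void *ctx,
			    unsigned char *buf, size_t len)
{
	size_t done = 0;
	int transactions = 0;

	while (done < len) {
		size_t n = len - done;
		int ret;

		if (n > AUX_MAX_CHUNK)
			n = AUX_MAX_CHUNK;
		ret = xfer(ctx, done, buf + done, n);
		if (ret < 0)
			return ret;
		done += n;
		transactions++;
	}
	return transactions;
}
```

Under these assumptions a 128-byte EDID block takes 8 transactions
rather than 128 single-byte ones, which is what keeps the debug logs
readable.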

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: set TASK_(UN)INTERRUPTIBLE before schedule_timeout()

set_current_state() is called only once before the first iteration.
After return from schedule_timeout() current state is TASK_RUNNING. If
we are going to wait again, set_current_state() must be called.
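
A userspace model of the corrected loop, for illustration only. The
functions below are stand-ins that merely share names with the kernel
primitives; wait_for() and done_after_three() are hypothetical:

```c
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

static enum task_state current_state = TASK_RUNNING;
static int sleeps_that_blocked;  /* sleeps entered in a non-running state */
static int polls;

/* Stand-in for the kernel's set_current_state(). */
static void set_current_state(enum task_state s)
{
	current_state = s;
}

/*
 * Stand-in for schedule_timeout(): it only really sleeps if the task is
 * not TASK_RUNNING, and the task is always TASK_RUNNING on return.
 */
static long schedule_timeout(long timeout)
{
	if (current_state != TASK_RUNNING)
		sleeps_that_blocked++;
	current_state = TASK_RUNNING;
	return timeout > 0 ? timeout - 1 : 0;
}

/* Hypothetical condition: becomes true on the fourth poll. */
static int done_after_three(void)
{
	return ++polls > 3;
}

/* The state is set again before every sleep, not once before the loop. */
static int wait_for(int (*done)(void), long timeout)
{
	while (!done()) {
		if (timeout <= 0)
			return -1;
		set_current_state(TASK_INTERRUPTIBLE);
		timeout = schedule_timeout(timeout);
	}
	set_current_state(TASK_RUNNING);
	return 0;
}
```

If set_current_state() were hoisted out of the loop in this model, only
the first iteration would actually block and the rest would spin.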

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: remove unused ttm bo list

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Fix AGP reset when AGP FW is already enabled on init.

Previously nouveau_mem_reset_agp() was only disabling AGP fast writes
when coming back from suspend. However, the "locked out of the card
because of FW" problem can also be reproduced on init if you
unload/reload nouveau.ko several times. This patch makes the AGP code
reset FW on init.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: unwind on load errors

nouveau_load() just returned directly if there was an error instead of
releasing resources.
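
The general shape of such an unwind path, sketched in userspace with
hypothetical names (struct fake_dev, fake_load()); the real
nouveau_load() acquires different resources:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state holding two resources acquired in order. */
struct fake_dev {
	void *mmio;
	void *ramin;
};

/*
 * Acquire resources in order; on failure, release whatever was already
 * acquired, in reverse order, before returning the error.
 */
static int fake_load(struct fake_dev *dev, int fail_second)
{
	int ret;

	dev->mmio = malloc(64);
	if (!dev->mmio) {
		ret = -ENOMEM;
		goto err_out;
	}

	/* fail_second simulates a failure in the second step. */
	dev->ramin = fail_second ? NULL : malloc(64);
	if (!dev->ramin) {
		ret = -ENOMEM;
		goto err_mmio;
	}

	return 0;

err_mmio:
	free(dev->mmio);
	dev->mmio = NULL;
err_out:
	return ret;
}
```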

Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Don't pass misaligned offsets to io_mapping_map_atomic_wc().

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix the INIT_CONFIGURE_PREINIT BIOS opcode.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Ack the context switch interrupt before switching contexts.

Leaving the IRQ unack'ed while switching contexts makes the switch
fail randomly on some nv1x.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv10: Fix up switching of NV10TCL_DMA_VTXBUF.

Not very nice, but I don't think there's a simpler workaround.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv17-nv4x: Attempt to init some external TMDS transmitters.

sil164 and friends are the most common, usually they just need to be
poked once because a fixed configuration is enough for any modes and
clocks, so they worked without this patch if the BIOS had done a good
job on POST. Display couldn't survive a suspend/resume cycle though.
Unfortunately, BIOS scripts are useless here.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: No need to set slave TV encoder configs explicitly.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv30: Workaround dual TMDS brain damage.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nvc0: starting point for GF100 support, everything stubbed

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvc0: allow INIT_GPIO

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvc0: implement memory detection

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvc0: rudimentary instmem support

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvc0: fix evo dma object so we display something

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvc0: implement crtc pll setting

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: implement init table op 0x57, INIT_LTIME

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>

drm/nouveau: implement init table opcodes 0x5e and 0x9a

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>

drm/nvc0: backup bar3 channel on suspend

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: reduce severity of some "error" messages

There are some known configurations where the lack of these tables/scripts
is perfectly normal, so reduce the visibility of the complaint messages to
debug.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Init dcb->or on cards that have no usable DCB table.

We need a valid OR value because there're a few nv17 cards with DCB v1.4.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv04: Fix up SGRAM density detection.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv30: Fix PFB init for nv31.

Fixes a regression introduced by 58bbb63720c8997e0136fe1884101e7ca40d68fd
(fdo bug 29324).

Reported-by: Johannes Obermayr <johannesobermayr@gmx.de>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix DCB TMDS config parsing.

Thinko caused by 43bda05428a3d2021f3c12220073e0251c65df8b.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nvc0: fix typo in PRAMIN flush

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Don't try DDC on the dummy I2C channel.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: fix minor thinko from nvc0 changes

drm/nouveau: check for error when allocating/mapping dummy page

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove warning about unknown tmds table revisions

This message is apparently confusing people, and is being blamed for some
modesetting issues.  Let's remove the message, and instead replace it
with an unconditional printout of the table revision.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: punt some more log messages to debug level

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50-nvc0: ramht_size is meant to be in bytes, not entries

Fixes an infinite loop that can happen in RAMHT lookup.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Add TV-out quirk for an MSI nForce2 IGP.

The blob also thinks there's a TV connected, so hardware bug...

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Workaround missing GPIO tables on an Apple iMac G4 NV18.

This should fix the reported TV-out load detection false positives
(fdo bug 29455).

Reported-by: Vlado Plaga <rechner@vlado-do.de>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nvc0: fix thinko in instmem suspend/resume

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: calculate vram reordering block size

Will be used at a later point when we plug in an alternative VRAM memory
manager for GeForce 8+ boards.

Based on pscnv code to do the same.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>

drm/nv50: add dcb type 14 to enum to prevent compiler complaint

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Use a helper function to match PCI device/subsystem IDs.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv30: Apply modesetting to the correct slave encoder

Signed-off-by: Patrice Mandin <patmandin@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix backlight control on PPC machines with an internal TMDS panel.

This commit fixes fdo bug 29685.

Reported-by: Vlado Plaga <rechner@vlado-do.de>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix TMDS on some DCB1.5 boards.

The TMDS output of an nv11 was being detected as LVDS, because it uses
DCB type 2 for TMDS instead of type 4.

Reported-by: Bertrand VIEILLE <Vieille.Bertrand@free.fr>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv20: Don't use pushbuf calls on the original nv20.

The "return" command is buggy on the original nv20: it jumps back to
the caller address as expected, but it doesn't clear the subroutine
active bit, making the subsequent pushbuf calls fail with a "stack"
overflow.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Fix suspend on some nv4x AGP cards.

On some nv4x cards (specifically, the ones that use an internal
PCIE->AGP bridge) the AGP controller state isn't preserved after a
suspend/resume cycle, and the AGP control registers have moved from
0x18xx to 0x100xx, so the FW check in nouveau_mem_reset_agp() doesn't
quite work. Check "dev->agp->mode" instead.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv20: Use the nv30 CRTC bandwidth calculation code.

nv2x CRTC FIFOs are as large as in nv3x (4kB it seems), and the FIFO
control registers have the same layout: we can make them share the
same implementation.

Previously we were using the nv1x code, but the calculated FIFO
watermarks are usually too low for nv2x and they cause horrible
scanout artifacts. They've gone unnoticed until now because we've been
leaving one of the bandwidth regs uninitialized (CRE 47, which
contains the most significant bits of FFLWM), so everything seemed to
work fine except in some cases after a cold boot, depending on the
memory bandwidth and pixel clocks used.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: add new accelerated bo move function

Hopefully this one will be better able to cope with moving tiled buffers
around without getting them all scrambled as a result.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: move check for no-op bo move before memcpy fallback

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove second map of notifier bo

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: require explicit unmap of kmapped bos

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv17-nv4x: Fix analog load detection false positive on rare occasions.

On some boards the residual current that DAC outputs draw while they're
disconnected can be high enough to give a false load detection
positive (I've only seen it in the S-video luma output of some cards,
but just to be sure). The output line capacitance is limited, so
sampling twice should fix it reliably.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv40: Try to set up CRE_LCD even if it has unknown bits set.

They don't seem to do anything useful, and we really want to program
CRE_LCD if we aren't lucky enough to find the right CRTC binding
already set.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: have nv_mask return original register value

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: initialize ramht_refs list for faked 0 channel

We need it for PFIFO_INTR_CACHE_ERROR interrupt handling, because
nouveau_fifo_swmthd looks for a matching gpuobj in the ramht_refs
list. This fixes a kernel panic in nouveau_gpuobj_ref_find.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: move ramht code out of nouveau_object.c, nothing to see here

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: modify object accessors, offset in bytes rather than dwords

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: demagic grctx, and add NVAF support

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>

drm/nv50: move vm trap to nv50_fb.c

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: report BAR access faults

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: rebase per-channel pramin heap offsets to 0

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove nouveau_gpuobj_ref completely, replace with sanity

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: simplify fake gpu objects

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: allow gpuobjs that aren't mapped into aperture

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: rework init ordering so nv50_instmem.c can be less bad

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: tidy ram{ht,fc,ro} a bit

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: add spinlock around ramht modifications

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix gpuobj refcount to use atomics

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: protect gpuobj list + global instmem heap with spinlock

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: remove nouveau_gpuobj_late_takedown

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: protect ramht_find() from oopsing if on channel without ramht

This doesn't actually happen now, but there's a test case for an earlier
kernel where a GPU error is signalled on one of nv50's fake channels, and
the ramht lookup by the IRQ handler triggered an oops.

This adds a check for RAMHT's existence on a channel before looking up
an object handle.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix SOR count for early chipsets

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Break some long lines in the TV-out code.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Don't remove ramht entries from the neighboring channels.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Don't enable AGP FW on nv18.

FW seems to be broken on nv18, it causes random lockups and breaks
suspend/resume even with the blob.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Add module parameter to override the default AGP rate.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: PRAMIN is available from the start on pre-nv50.

This makes sure that RAMHT is cleared correctly on start up.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Remove implicit argument from nv_wait().

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: Simplify tile region handling.

Instead of emptying the caches to avoid a race with the PFIFO puller,
go straight ahead and try to recover from it when it happens. Also,
kill pfifo->cache_flush and tile->lock, we don't need them anymore.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: handle fifo pusher errors better

The most important part of this change is that we now instruct PFIFO to
drop all pending fetches, rather than attempting to skip a single dword
and hoping that things would magically sort themselves out - they usually
don't, and we end up with PFIFO being completely hung.

This commit also adds somewhat more useful logging when these exceptions
occur.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

fix compile error due to upstream differences

drm/nouveau: we can't free ACPI EDID, so make a copy that we can

The rest of the connector code assumes we can kfree() the EDID pointer.
This causes things to blow up with the ACPI EDID pointer we get
passed.
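
The idea in a userspace sketch, with malloc()/free() standing in for
the kernel allocator. edid_dup() is a hypothetical helper, and a single
128-byte EDID block is assumed:

```c
#include <stdlib.h>
#include <string.h>

#define EDID_BLOCK_SIZE 128  /* assumed: one base EDID block */

/*
 * Hypothetical helper: return a heap copy of an EDID block that the
 * caller owns and may free, leaving the firmware-owned source alone.
 */
static unsigned char *edid_dup(const unsigned char *src)
{
	unsigned char *copy;

	if (!src)
		return NULL;

	copy = malloc(EDID_BLOCK_SIZE);
	if (!copy)
		return NULL;

	memcpy(copy, src, EDID_BLOCK_SIZE);
	return copy;
}
```

Code that later frees the connector's EDID pointer then only ever sees
memory it is allowed to free.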

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: mark PCIEGART pages non-present rather than using dummy page

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: zero dummy page

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fix 100c90 write on nva3

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Fix build regression, undefined reference to `acpi_video_get_edid'

Build breakage:

drivers/built-in.o: In function `nouveau_acpi_edid':
(.text+0x13404e): undefined reference to `acpi_video_get_edid'
make: *** [.tmp_vmlinux1] Error 1

Introduced by:

a6ed76d7ffc62ffa474b41d31b011b6853c5de32 is the first bad commit
commit a6ed76d7ffc62ffa474b41d31b011b6853c5de32
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Mon Jul 12 15:33:07 2010 +1000

    drm/nouveau: support fetching LVDS EDID from ACPI

    Based on a patch from Matthew Garrett.

    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    Acked-by: Matthew Garrett <mjg@redhat.com>

It doesn't seem to revert cleanly, but the problem lies in these
two config entries:

CONFIG_ACPI=y
CONFIG_ACPI_VIDEO=m

Adding a select for ACPI_VIDEO appears to be the best solution, and
is comparable to what is done in DRM_I915.  Builds, boots, and appears to
work correctly.

Signed-off-by: Philip J. Turmel <philip@turmel.org>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: flush bar1 vm / dma object setup before poking 0x1708

Should fix issues noticed on NVAC (MacBook Pro / ION) since gpuobj
rework.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: correct INIT_DP_CONDITION subcondition 5

Fixes DP output on a GTX 465 board I have.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: add debugfs file to forcibly evict everything from vram

Very useful for debugging buffer migration issues.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: assume smaller tiles for bo moves

Somehow fixes some corruption seen in KDE.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix panels using straps-based mode detection

nouveau_bios_fp_mode() zeroes the mode struct before filling in relevant
entries.  This nukes the mode id initialised by drm_mode_create(), and
causes warnings from idr when we try to remove the mode.
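
As an illustration of the failure mode (not the code in this patch): wiping
a whole struct also wipes fields that were assigned when the object was
created. The sketch below uses a made-up struct demo_mode and a hypothetical
helper demo_fill_mode() to show one way of keeping such a field across a
memset(). The actual fix may restructure the code differently.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for a mode object; only "id" matters for this sketch. */
struct demo_mode {
	int id;        /* assigned at creation time, must survive a refill */
	int hdisplay;
	int vdisplay;
	int clock;
};

/* Zero every field except the id, then fill in the timings. */
void demo_fill_mode(struct demo_mode *mode, int hdisplay, int vdisplay,
		    int clock)
{
	int id = mode->id;              /* save before wiping */

	memset(mode, 0, sizeof(*mode));
	mode->id = id;                  /* restore the creation-time id */
	mode->hdisplay = hdisplay;
	mode->vdisplay = vdisplay;
	mode->clock = clock;
}

/* Small self-check: the id is unchanged after the refill. */
void demo_fill_mode_example(void)
{
	struct demo_mode m = { .id = 7 };

	demo_fill_mode(&m, 1024, 768, 65000);
	assert(m.id == 7);
}
```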

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv10: Don't oops if the card wants to switch to a channel with no grctx.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: enable enhanced framing only if DP display supports it

Reported-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix typo in c2aa91afea5f7e7ae4530fabd37414a79c03328c

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix required mode bandwidth calculation for DP

This should fix eDP on certain laptops with 18-bit panels. We were rejecting
the panel's native mode because we thought there was insufficient bandwidth
for it.
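
A rough sketch of the arithmetic involved, under stated assumptions: the
helper name dp_mode_fits() and its parameters are hypothetical, and the
per-lane payload figures (162000 or 270000 kB/s, i.e. 1.62 or 2.7 Gbit/s
after 8b/10b coding) are the standard DP link rates rather than values taken
from this patch. The point is only that the required bandwidth scales with
the bits per pixel actually sent, so assuming 24 bpp for an 18-bit panel
overestimates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the driver's code.
 * pixel_clock_khz: mode pixel clock in kHz
 * bpp:             bits per pixel sent to the panel (18 or 24)
 * lanes:           number of DP lanes in use
 * lane_kbytes:     payload rate per lane in kB/s (assumed 162000 or 270000)
 */
bool dp_mode_fits(uint32_t pixel_clock_khz, unsigned bpp,
		  unsigned lanes, uint32_t lane_kbytes)
{
	/* kHz * bits / 8 gives kB/s */
	uint64_t required = (uint64_t)pixel_clock_khz * bpp / 8;
	uint64_t available = (uint64_t)lanes * lane_kbytes;

	return required <= available;
}

/* A 100 MHz mode on a single 2.7 Gbit/s lane. */
void dp_mode_fits_example(void)
{
	/* 24 bpp: 300000 kB/s needed, 270000 available */
	assert(!dp_mode_fits(100000, 24, 1, 270000));
	/* 18 bpp: 225000 kB/s needed, 270000 available */
	assert(dp_mode_fits(100000, 18, 1, 270000));
}
```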

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv30-nv40: Fix postdivider mask when writing engine/memory PLLs.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv0x-nv4x: Leave the 0x40 bit untouched when changing CRE_LCD.

It's an unrelated PLL filtering control bit, leave it alone when
changing the CRTC-encoder binding.
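
The general pattern is a read-modify-write that preserves the unrelated bit.
A minimal sketch, assuming a hypothetical helper set_field_keep_bits() that
works on a plain byte rather than on the real CRTC register accessors:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return "new_val" with the bits selected by
 * "keep_mask" taken from "old" instead, so they are left untouched.
 */
uint8_t set_field_keep_bits(uint8_t old, uint8_t new_val, uint8_t keep_mask)
{
	return (uint8_t)((old & keep_mask) | (new_val & (uint8_t)~keep_mask));
}

/* Bit 0x40 is set in the old value and must still be set afterwards. */
void set_field_keep_bits_example(void)
{
	assert(set_field_keep_bits(0x43, 0x08, 0x40) == 0x48);
}
```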

Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nv50: prevent (IB_PUT == IB_GET) from occurring unless idle

Should fix a DMA race condition I've never seen myself, but could be
the culprit in some random hangs that have been reported.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Add a module option to force card POST.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Try to fetch an EDID from OF if DDC fails.

More Apple brain damage, it fixes the modesetting failure on an eMac
G4 (fdo bug 29810).

Reported-by: Zoltan Varnagy <doi@freemail.hu>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>

drm/nouveau: better handling of unmappable vram

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix chipset vs card_type thinko

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
---
 drivers/gpu/drm/drm_crtc_helper.c           |   22 +-
 drivers/gpu/drm/i2c/ch7006_drv.c            |   22 +-
 drivers/gpu/drm/i2c/ch7006_priv.h           |    2 +-
 drivers/gpu/drm/nouveau/Kconfig             |    1 +
 drivers/gpu/drm/nouveau/Makefile            |   12 +-
 drivers/gpu/drm/nouveau/nouveau_acpi.c      |   38 +-
 drivers/gpu/drm/nouveau/nouveau_bios.c      |  914 ++++++--
 drivers/gpu/drm/nouveau/nouveau_bios.h      |    6 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c        |  251 ++-
 drivers/gpu/drm/nouveau/nouveau_calc.c      |   10 +-
 drivers/gpu/drm/nouveau/nouveau_channel.c   |   18 +-
 drivers/gpu/drm/nouveau/nouveau_connector.c |  473 ++--
 drivers/gpu/drm/nouveau/nouveau_connector.h |   10 +-
 drivers/gpu/drm/nouveau/nouveau_debugfs.c   |   16 +
 drivers/gpu/drm/nouveau/nouveau_dma.c       |   23 +-
 drivers/gpu/drm/nouveau/nouveau_dp.c        |  138 +-
 drivers/gpu/drm/nouveau/nouveau_drv.c       |   49 +-
 drivers/gpu/drm/nouveau/nouveau_drv.h       |  310 ++--
 drivers/gpu/drm/nouveau/nouveau_encoder.h   |   17 +-
 drivers/gpu/drm/nouveau/nouveau_fbcon.c     |    4 +-
 drivers/gpu/drm/nouveau/nouveau_fence.c     |   35 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c       |   15 +-
 drivers/gpu/drm/nouveau/nouveau_grctx.c     |  160 --
 drivers/gpu/drm/nouveau/nouveau_grctx.h     |    2 +-
 drivers/gpu/drm/nouveau/nouveau_hw.c        |   15 +-
 drivers/gpu/drm/nouveau/nouveau_i2c.c       |   83 +-
 drivers/gpu/drm/nouveau/nouveau_i2c.h       |   11 +-
 drivers/gpu/drm/nouveau/nouveau_irq.c       |  129 +-
 drivers/gpu/drm/nouveau/nouveau_mem.c       |  529 ++---
 drivers/gpu/drm/nouveau/nouveau_notifier.c  |   37 +-
 drivers/gpu/drm/nouveau/nouveau_object.c    |  853 +++-----
 drivers/gpu/drm/nouveau/nouveau_ramht.c     |  289 +++
 drivers/gpu/drm/nouveau/nouveau_ramht.h     |   55 +
 drivers/gpu/drm/nouveau/nouveau_reg.h       |  118 +-
 drivers/gpu/drm/nouveau/nouveau_sgdma.c     |  118 +-
 drivers/gpu/drm/nouveau/nouveau_state.c     |  398 ++--
 drivers/gpu/drm/nouveau/nv04_crtc.c         |   11 +-
 drivers/gpu/drm/nouveau/nv04_dac.c          |   61 +-
 drivers/gpu/drm/nouveau/nv04_dfp.c          |  147 +-
 drivers/gpu/drm/nouveau/nv04_display.c      |   90 +-
 drivers/gpu/drm/nouveau/nv04_fbcon.c        |    9 +-
 drivers/gpu/drm/nouveau/nv04_fifo.c         |   88 +-
 drivers/gpu/drm/nouveau/nv04_graph.c        |    5 +-
 drivers/gpu/drm/nouveau/nv04_instmem.c      |  167 +-
 drivers/gpu/drm/nouveau/nv04_mc.c           |    4 +
 drivers/gpu/drm/nouveau/nv04_tv.c           |  139 +-
 drivers/gpu/drm/nouveau/nv10_fifo.c         |   29 +-
 drivers/gpu/drm/nouveau/nv10_gpio.c         |   92 +
 drivers/gpu/drm/nouveau/nv10_graph.c        |  177 +-
 drivers/gpu/drm/nouveau/nv17_gpio.c         |   92 -
 drivers/gpu/drm/nouveau/nv17_tv.c           |  179 +-
 drivers/gpu/drm/nouveau/nv17_tv.h           |   15 +-
 drivers/gpu/drm/nouveau/nv17_tv_modes.c     |   48 +-
 drivers/gpu/drm/nouveau/nv20_graph.c        |  576 +++---
 drivers/gpu/drm/nouveau/nv30_fb.c           |   95 +
 drivers/gpu/drm/nouveau/nv40_fifo.c         |   28 +-
 drivers/gpu/drm/nouveau/nv40_graph.c        |   72 +-
 drivers/gpu/drm/nouveau/nv40_grctx.c        |    6 +-
 drivers/gpu/drm/nouveau/nv40_mc.c           |    2 +-
 drivers/gpu/drm/nouveau/nv50_crtc.c         |   67 +-
 drivers/gpu/drm/nouveau/nv50_cursor.c       |    2 +-
 drivers/gpu/drm/nouveau/nv50_dac.c          |   47 +-
 drivers/gpu/drm/nouveau/nv50_display.c      |  496 +++--
 drivers/gpu/drm/nouveau/nv50_display.h      |    6 +-
 drivers/gpu/drm/nouveau/nv50_fb.c           |   40 +
 drivers/gpu/drm/nouveau/nv50_fbcon.c        |    4 +-
 drivers/gpu/drm/nouveau/nv50_fifo.c         |  396 ++--
 drivers/gpu/drm/nouveau/nv50_gpio.c         |   35 +
 drivers/gpu/drm/nouveau/nv50_graph.c        |  131 +-
 drivers/gpu/drm/nouveau/nv50_grctx.c        | 3305 +++++++++++++++++----------
 drivers/gpu/drm/nouveau/nv50_instmem.c      |  471 ++---
 drivers/gpu/drm/nouveau/nv50_sor.c          |  109 +-
 drivers/gpu/drm/nouveau/nvc0_fb.c           |   38 +
 drivers/gpu/drm/nouveau/nvc0_fifo.c         |   89 +
 drivers/gpu/drm/nouveau/nvc0_graph.c        |   74 +
 drivers/gpu/drm/nouveau/nvc0_instmem.c      |  229 ++
 drivers/gpu/drm/nouveau/nvreg.h             |   23 +-
 77 files changed, 7584 insertions(+), 5293 deletions(-)
 delete mode 100644 drivers/gpu/drm/nouveau/nouveau_grctx.c
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_ramht.c
 create mode 100644 drivers/gpu/drm/nouveau/nouveau_ramht.h
 create mode 100644 drivers/gpu/drm/nouveau/nv10_gpio.c
 delete mode 100644 drivers/gpu/drm/nouveau/nv17_gpio.c
 create mode 100644 drivers/gpu/drm/nouveau/nv30_fb.c
 create mode 100644 drivers/gpu/drm/nouveau/nvc0_fb.c
 create mode 100644 drivers/gpu/drm/nouveau/nvc0_fifo.c
 create mode 100644 drivers/gpu/drm/nouveau/nvc0_graph.c
 create mode 100644 drivers/gpu/drm/nouveau/nvc0_instmem.c

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 9b2a541..1eaa315 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -201,6 +201,17 @@ bool drm_helper_crtc_in_use(struct drm_crtc *crtc)
 }
 EXPORT_SYMBOL(drm_helper_crtc_in_use);
 
+static void
+drm_encoder_disable(struct drm_encoder *encoder)
+{
+	struct drm_encoder_helper_funcs *encoder_funcs = encoder->helper_private;
+
+	if (encoder_funcs->disable)
+		(*encoder_funcs->disable)(encoder);
+	else
+		(*encoder_funcs->dpms)(encoder, DRM_MODE_DPMS_OFF);
+}
+
 /**
  * drm_helper_disable_unused_functions - disable unused objects
  * @dev: DRM device
@@ -215,7 +226,6 @@ void drm_helper_disable_unused_functions(struct drm_device *dev)
 {
 	struct drm_encoder *encoder;
 	struct drm_connector *connector;
-	struct drm_encoder_helper_funcs *encoder_funcs;
 	struct drm_crtc *crtc;
 
 	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
@@ -226,12 +236,8 @@ void drm_helper_disable_unused_functions(struct drm_device *dev)
 	}
 
 	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
-		encoder_funcs = encoder->helper_private;
 		if (!drm_helper_encoder_in_use(encoder)) {
-			if (encoder_funcs->disable)
-				(*encoder_funcs->disable)(encoder);
-			else
-				(*encoder_funcs->dpms)(encoder, DRM_MODE_DPMS_OFF);
+			drm_encoder_disable(encoder);
 			/* disconnector encoder from any connector */
 			encoder->crtc = NULL;
 		}
@@ -292,11 +298,11 @@ drm_crtc_prepare_encoders(struct drm_device *dev)
 		encoder_funcs = encoder->helper_private;
 		/* Disable unused encoders */
 		if (encoder->crtc == NULL)
-			(*encoder_funcs->dpms)(encoder, DRM_MODE_DPMS_OFF);
+			drm_encoder_disable(encoder);
 		/* Disable encoders whose CRTC is about to change */
 		if (encoder_funcs->get_crtc &&
 		    encoder->crtc != (*encoder_funcs->get_crtc)(encoder))
-			(*encoder_funcs->dpms)(encoder, DRM_MODE_DPMS_OFF);
+			drm_encoder_disable(encoder);
 	}
 }
 
diff --git a/drivers/gpu/drm/i2c/ch7006_drv.c b/drivers/gpu/drm/i2c/ch7006_drv.c
index 8c760c7..08792a7 100644
--- a/drivers/gpu/drm/i2c/ch7006_drv.c
+++ b/drivers/gpu/drm/i2c/ch7006_drv.c
@@ -33,7 +33,7 @@ static void ch7006_encoder_set_config(struct drm_encoder *encoder,
 {
 	struct ch7006_priv *priv = to_ch7006_priv(encoder);
 
-	priv->params = params;
+	priv->params = *(struct ch7006_encoder_params *)params;
 }
 
 static void ch7006_encoder_destroy(struct drm_encoder *encoder)
@@ -114,7 +114,7 @@ static void ch7006_encoder_mode_set(struct drm_encoder *encoder,
 {
 	struct i2c_client *client = drm_i2c_encoder_get_client(encoder);
 	struct ch7006_priv *priv = to_ch7006_priv(encoder);
-	struct ch7006_encoder_params *params = priv->params;
+	struct ch7006_encoder_params *params = &priv->params;
 	struct ch7006_state *state = &priv->state;
 	uint8_t *regs = state->regs;
 	struct ch7006_mode *mode = priv->mode;
@@ -428,6 +428,22 @@ static int ch7006_remove(struct i2c_client *client)
 	return 0;
 }
 
+static int ch7006_suspend(struct i2c_client *client, pm_message_t mesg)
+{
+	ch7006_dbg(client, "\n");
+
+	return 0;
+}
+
+static int ch7006_resume(struct i2c_client *client)
+{
+	ch7006_dbg(client, "\n");
+
+	ch7006_write(client, 0x3d, 0x0);
+
+	return 0;
+}
+
 static int ch7006_encoder_init(struct i2c_client *client,
 			       struct drm_device *dev,
 			       struct drm_encoder_slave *encoder)
@@ -488,6 +504,8 @@ static struct drm_i2c_encoder_driver ch7006_driver = {
 	.i2c_driver = {
 		.probe = ch7006_probe,
 		.remove = ch7006_remove,
+		.suspend = ch7006_suspend,
+		.resume = ch7006_resume,
 
 		.driver = {
 			.name = "ch7006",
diff --git a/drivers/gpu/drm/i2c/ch7006_priv.h b/drivers/gpu/drm/i2c/ch7006_priv.h
index 9487123..17667b7 100644
--- a/drivers/gpu/drm/i2c/ch7006_priv.h
+++ b/drivers/gpu/drm/i2c/ch7006_priv.h
@@ -77,7 +77,7 @@ struct ch7006_state {
 };
 
 struct ch7006_priv {
-	struct ch7006_encoder_params *params;
+	struct ch7006_encoder_params params;
 	struct ch7006_mode *mode;
 
 	struct ch7006_state state;
diff --git a/drivers/gpu/drm/nouveau/Kconfig b/drivers/gpu/drm/nouveau/Kconfig
index 6b8967a..15ca435 100644
--- a/drivers/gpu/drm/nouveau/Kconfig
+++ b/drivers/gpu/drm/nouveau/Kconfig
@@ -10,6 +10,7 @@ config DRM_NOUVEAU
 	select FB
 	select FRAMEBUFFER_CONSOLE if !EMBEDDED
 	select FB_BACKLIGHT if DRM_NOUVEAU_BACKLIGHT
+	select ACPI_VIDEO if ACPI
 	help
 	  Choose this option for open-source nVidia support.
 
diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index acd31ed..d6cfbf2 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -9,20 +9,20 @@ nouveau-y := nouveau_drv.o nouveau_state.o nouveau_channel.o nouveau_mem.o \
              nouveau_bo.o nouveau_fence.o nouveau_gem.o nouveau_ttm.o \
              nouveau_hw.o nouveau_calc.o nouveau_bios.o nouveau_i2c.o \
              nouveau_display.o nouveau_connector.o nouveau_fbcon.o \
-             nouveau_dp.o nouveau_grctx.o \
+             nouveau_dp.o nouveau_ramht.o \
              nv04_timer.o \
              nv04_mc.o nv40_mc.o nv50_mc.o \
-             nv04_fb.o nv10_fb.o nv40_fb.o nv50_fb.o \
-             nv04_fifo.o nv10_fifo.o nv40_fifo.o nv50_fifo.o \
+             nv04_fb.o nv10_fb.o nv30_fb.o nv40_fb.o nv50_fb.o nvc0_fb.o \
+             nv04_fifo.o nv10_fifo.o nv40_fifo.o nv50_fifo.o nvc0_fifo.o \
              nv04_graph.o nv10_graph.o nv20_graph.o \
-             nv40_graph.o nv50_graph.o \
+             nv40_graph.o nv50_graph.o nvc0_graph.o \
              nv40_grctx.o nv50_grctx.o \
-             nv04_instmem.o nv50_instmem.o \
+             nv04_instmem.o nv50_instmem.o nvc0_instmem.o \
              nv50_crtc.o nv50_dac.o nv50_sor.o \
              nv50_cursor.o nv50_display.o nv50_fbcon.o \
              nv04_dac.o nv04_dfp.o nv04_tv.o nv17_tv.o nv17_tv_modes.o \
              nv04_crtc.o nv04_display.o nv04_cursor.o nv04_fbcon.o \
-             nv17_gpio.o nv50_gpio.o \
+             nv10_gpio.o nv50_gpio.o \
 	     nv50_calc.o
 
 nouveau-$(CONFIG_DRM_NOUVEAU_DEBUG) += nouveau_debugfs.o
diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index d4bcca8..1191526 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <acpi/acpi_drivers.h>
 #include <acpi/acpi_bus.h>
+#include <acpi/video.h>
 
 #include "drmP.h"
 #include "drm.h"
@@ -11,6 +12,7 @@
 #include "nouveau_drv.h"
 #include "nouveau_drm.h"
 #include "nv50_display.h"
+#include "nouveau_connector.h"
 
 #include <linux/vga_switcheroo.h>
 
@@ -42,7 +44,7 @@ static const char nouveau_dsm_muid[] = {
 	0xB3, 0x4D, 0x7E, 0x5F, 0xEA, 0x12, 0x9F, 0xD4,
 };
 
-static int nouveau_dsm(acpi_handle handle, int func, int arg, int *result)
+static int nouveau_dsm(acpi_handle handle, int func, int arg, uint32_t *result)
 {
 	struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_object_list input;
@@ -259,3 +261,37 @@ int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len)
 {
 	return nouveau_rom_call(nouveau_dsm_priv.rom_handle, bios, offset, len);
 }
+
+int
+nouveau_acpi_edid(struct drm_device *dev, struct drm_connector *connector)
+{
+	struct nouveau_connector *nv_connector = nouveau_connector(connector);
+	struct acpi_device *acpidev;
+	acpi_handle handle;
+	int type, ret;
+	void *edid;
+
+	switch (connector->connector_type) {
+	case DRM_MODE_CONNECTOR_LVDS:
+	case DRM_MODE_CONNECTOR_eDP:
+		type = ACPI_VIDEO_DISPLAY_LCD;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	handle = DEVICE_ACPI_HANDLE(&dev->pdev->dev);
+	if (!handle)
+		return -ENODEV;
+
+	ret = acpi_bus_get_device(handle, &acpidev);
+	if (ret)
+		return -ENODEV;
+
+	ret = acpi_video_get_edid(acpidev, type, -1, &edid);
+	if (ret < 0)
+		return ret;
+
+	nv_connector->edid = kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c b/drivers/gpu/drm/nouveau/nouveau_bios.c
index e492919..72905c9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -28,6 +28,8 @@
 #include "nouveau_hw.h"
 #include "nouveau_encoder.h"
 
+#include <linux/io-mapping.h>
+
 /* these defines are made up */
 #define NV_CIO_CRE_44_HEADA 0x0
 #define NV_CIO_CRE_44_HEADB 0x3
@@ -209,20 +211,20 @@ static struct methods shadow_methods[] = {
 	{ "PCIROM", load_vbios_pci, true },
 	{ "ACPI", load_vbios_acpi, true },
 };
+#define NUM_SHADOW_METHODS ARRAY_SIZE(shadow_methods)
 
 static bool NVShadowVBIOS(struct drm_device *dev, uint8_t *data)
 {
-	const int nr_methods = ARRAY_SIZE(shadow_methods);
 	struct methods *methods = shadow_methods;
 	int testscore = 3;
-	int scores[nr_methods], i;
+	int scores[NUM_SHADOW_METHODS], i;
 
 	if (nouveau_vbios) {
-		for (i = 0; i < nr_methods; i++)
+		for (i = 0; i < NUM_SHADOW_METHODS; i++)
 			if (!strcasecmp(nouveau_vbios, methods[i].desc))
 				break;
 
-		if (i < nr_methods) {
+		if (i < NUM_SHADOW_METHODS) {
 			NV_INFO(dev, "Attempting to use BIOS image from %s\n",
 				methods[i].desc);
 
@@ -234,7 +236,7 @@ static bool NVShadowVBIOS(struct drm_device *dev, uint8_t *data)
 		NV_ERROR(dev, "VBIOS source \'%s\' invalid\n", nouveau_vbios);
 	}
 
-	for (i = 0; i < nr_methods; i++) {
+	for (i = 0; i < NUM_SHADOW_METHODS; i++) {
 		NV_TRACE(dev, "Attempting to load BIOS image from %s\n",
 			 methods[i].desc);
 		data[0] = data[1] = 0;	/* avoid reuse of previous image */
@@ -245,7 +247,7 @@ static bool NVShadowVBIOS(struct drm_device *dev, uint8_t *data)
 	}
 
 	while (--testscore > 0) {
-		for (i = 0; i < nr_methods; i++) {
+		for (i = 0; i < NUM_SHADOW_METHODS; i++) {
 			if (scores[i] == testscore) {
 				NV_TRACE(dev, "Using BIOS image from %s\n",
 					 methods[i].desc);
@@ -920,7 +922,7 @@ init_io_restrict_prog(struct nvbios *bios, uint16_t offset,
 		NV_ERROR(bios->dev,
 			 "0x%04X: Config 0x%02X exceeds maximal bound 0x%02X\n",
 			 offset, config, count);
-		return -EINVAL;
+		return len;
 	}
 
 	configval = ROM32(bios->data[offset + 11 + config * 4]);
@@ -1022,7 +1024,7 @@ init_io_restrict_pll(struct nvbios *bios, uint16_t offset,
 		NV_ERROR(bios->dev,
 			 "0x%04X: Config 0x%02X exceeds maximal bound 0x%02X\n",
 			 offset, config, count);
-		return -EINVAL;
+		return len;
 	}
 
 	freq = ROM16(bios->data[offset + 12 + config * 2]);
@@ -1194,7 +1196,7 @@ init_dp_condition(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	dpe = nouveau_bios_dp_table(dev, dcb, &dummy);
 	if (!dpe) {
 		NV_ERROR(dev, "0x%04X: INIT_3A: no encoder table!!\n", offset);
-		return -EINVAL;
+		return 3;
 	}
 
 	switch (cond) {
@@ -1218,14 +1220,18 @@ init_dp_condition(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		int ret;
 
 		auxch = nouveau_i2c_find(dev, bios->display.output->i2c_index);
-		if (!auxch)
-			return -ENODEV;
+		if (!auxch) {
+			NV_ERROR(dev, "0x%04X: couldn't get auxch\n", offset);
+			return 3;
+		}
 
 		ret = nouveau_dp_auxch(auxch, 9, 0xd, &cond, 1);
-		if (ret)
-			return ret;
+		if (ret) {
+			NV_ERROR(dev, "0x%04X: auxch rd fail: %d\n", offset, ret);
+			return 3;
+		}
 
-		if (cond & 1)
+		if (!(cond & 1))
 			iexec->execute = false;
 	}
 		break;
@@ -1392,7 +1398,7 @@ init_io_restrict_pll2(struct nvbios *bios, uint16_t offset,
 		NV_ERROR(bios->dev,
 			 "0x%04X: Config 0x%02X exceeds maximal bound 0x%02X\n",
 			 offset, config, count);
-		return -EINVAL;
+		return len;
 	}
 
 	freq = ROM32(bios->data[offset + 11 + config * 4]);
@@ -1452,6 +1458,7 @@ init_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 * "mask n" and OR it with "data n" before writing it back to the device
 	 */
 
+	struct drm_device *dev = bios->dev;
 	uint8_t i2c_index = bios->data[offset + 1];
 	uint8_t i2c_address = bios->data[offset + 2] >> 1;
 	uint8_t count = bios->data[offset + 3];
@@ -1466,9 +1473,11 @@ init_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		      "Count: 0x%02X\n",
 		offset, i2c_index, i2c_address, count);
 
-	chan = init_i2c_device_find(bios->dev, i2c_index);
-	if (!chan)
-		return -ENODEV;
+	chan = init_i2c_device_find(dev, i2c_index);
+	if (!chan) {
+		NV_ERROR(dev, "0x%04X: i2c bus not found\n", offset);
+		return len;
+	}
 
 	for (i = 0; i < count; i++) {
 		uint8_t reg = bios->data[offset + 4 + i * 3];
@@ -1479,8 +1488,10 @@ init_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = i2c_smbus_xfer(&chan->adapter, i2c_address, 0,
 				     I2C_SMBUS_READ, reg,
 				     I2C_SMBUS_BYTE_DATA, &val);
-		if (ret < 0)
-			return ret;
+		if (ret < 0) {
+			NV_ERROR(dev, "0x%04X: i2c rd fail: %d\n", offset, ret);
+			return len;
+		}
 
 		BIOSLOG(bios, "0x%04X: I2CReg: 0x%02X, Value: 0x%02X, "
 			      "Mask: 0x%02X, Data: 0x%02X\n",
@@ -1494,8 +1505,10 @@ init_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = i2c_smbus_xfer(&chan->adapter, i2c_address, 0,
 				     I2C_SMBUS_WRITE, reg,
 				     I2C_SMBUS_BYTE_DATA, &val);
-		if (ret < 0)
-			return ret;
+		if (ret < 0) {
+			NV_ERROR(dev, "0x%04X: i2c wr fail: %d\n", offset, ret);
+			return len;
+		}
 	}
 
 	return len;
@@ -1520,6 +1533,7 @@ init_zm_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 * "DCB I2C table entry index", set the register to "data n"
 	 */
 
+	struct drm_device *dev = bios->dev;
 	uint8_t i2c_index = bios->data[offset + 1];
 	uint8_t i2c_address = bios->data[offset + 2] >> 1;
 	uint8_t count = bios->data[offset + 3];
@@ -1534,9 +1548,11 @@ init_zm_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		      "Count: 0x%02X\n",
 		offset, i2c_index, i2c_address, count);
 
-	chan = init_i2c_device_find(bios->dev, i2c_index);
-	if (!chan)
-		return -ENODEV;
+	chan = init_i2c_device_find(dev, i2c_index);
+	if (!chan) {
+		NV_ERROR(dev, "0x%04X: i2c bus not found\n", offset);
+		return len;
+	}
 
 	for (i = 0; i < count; i++) {
 		uint8_t reg = bios->data[offset + 4 + i * 2];
@@ -1553,8 +1569,10 @@ init_zm_i2c_byte(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = i2c_smbus_xfer(&chan->adapter, i2c_address, 0,
 				     I2C_SMBUS_WRITE, reg,
 				     I2C_SMBUS_BYTE_DATA, &val);
-		if (ret < 0)
-			return ret;
+		if (ret < 0) {
+			NV_ERROR(dev, "0x%04X: i2c wr fail: %d\n", offset, ret);
+			return len;
+		}
 	}
 
 	return len;
@@ -1577,6 +1595,7 @@ init_zm_i2c(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 * address" on the I2C bus given by "DCB I2C table entry index"
 	 */
 
+	struct drm_device *dev = bios->dev;
 	uint8_t i2c_index = bios->data[offset + 1];
 	uint8_t i2c_address = bios->data[offset + 2] >> 1;
 	uint8_t count = bios->data[offset + 3];
@@ -1584,7 +1603,7 @@ init_zm_i2c(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	struct nouveau_i2c_chan *chan;
 	struct i2c_msg msg;
 	uint8_t data[256];
-	int i;
+	int ret, i;
 
 	if (!iexec->execute)
 		return len;
@@ -1593,9 +1612,11 @@ init_zm_i2c(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		      "Count: 0x%02X\n",
 		offset, i2c_index, i2c_address, count);
 
-	chan = init_i2c_device_find(bios->dev, i2c_index);
-	if (!chan)
-		return -ENODEV;
+	chan = init_i2c_device_find(dev, i2c_index);
+	if (!chan) {
+		NV_ERROR(dev, "0x%04X: i2c bus not found\n", offset);
+		return len;
+	}
 
 	for (i = 0; i < count; i++) {
 		data[i] = bios->data[offset + 4 + i];
@@ -1608,8 +1629,11 @@ init_zm_i2c(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		msg.flags = 0;
 		msg.len = count;
 		msg.buf = data;
-		if (i2c_transfer(&chan->adapter, &msg, 1) != 1)
-			return -EIO;
+		ret = i2c_transfer(&chan->adapter, &msg, 1);
+		if (ret != 1) {
+			NV_ERROR(dev, "0x%04X: i2c wr fail: %d\n", offset, ret);
+			return len;
+		}
 	}
 
 	return len;
@@ -1633,6 +1657,7 @@ init_tmds(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 * used -- see get_tmds_index_reg()
 	 */
 
+	struct drm_device *dev = bios->dev;
 	uint8_t mlv = bios->data[offset + 1];
 	uint32_t tmdsaddr = bios->data[offset + 2];
 	uint8_t mask = bios->data[offset + 3];
@@ -1647,8 +1672,10 @@ init_tmds(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		offset, mlv, tmdsaddr, mask, data);
 
 	reg = get_tmds_index_reg(bios->dev, mlv);
-	if (!reg)
-		return -EINVAL;
+	if (!reg) {
+		NV_ERROR(dev, "0x%04X: no tmds_index_reg\n", offset);
+		return 5;
+	}
 
 	bios_wr32(bios, reg,
 		  tmdsaddr | NV_PRAMDAC_FP_TMDS_CONTROL_WRITE_DISABLE);
@@ -1678,6 +1705,7 @@ init_zm_tmds_group(struct nvbios *bios, uint16_t offset,
 	 * register is used -- see get_tmds_index_reg()
 	 */
 
+	struct drm_device *dev = bios->dev;
 	uint8_t mlv = bios->data[offset + 1];
 	uint8_t count = bios->data[offset + 2];
 	int len = 3 + count * 2;
@@ -1691,8 +1719,10 @@ init_zm_tmds_group(struct nvbios *bios, uint16_t offset,
 		offset, mlv, count);
 
 	reg = get_tmds_index_reg(bios->dev, mlv);
-	if (!reg)
-		return -EINVAL;
+	if (!reg) {
+		NV_ERROR(dev, "0x%04X: no tmds_index_reg\n", offset);
+		return len;
+	}
 
 	for (i = 0; i < count; i++) {
 		uint8_t tmdsaddr = bios->data[offset + 3 + i * 2];
@@ -1898,6 +1928,31 @@ init_condition_time(struct nvbios *bios, uint16_t offset,
 }
 
 static int
+init_ltime(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
+{
+	/*
+	 * INIT_LTIME   opcode: 0x57 ('V')
+	 *
+	 * offset      (8  bit): opcode
+	 * offset + 1  (16 bit): time
+	 *
+	 * Sleep for "time" milliseconds.
+	 */
+
+	unsigned time = ROM16(bios->data[offset + 1]);
+
+	if (!iexec->execute)
+		return 3;
+
+	BIOSLOG(bios, "0x%04X: Sleeping for 0x%04X milliseconds\n",
+		offset, time);
+
+	msleep(time);
+
+	return 3;
+}
+
+static int
 init_zm_reg_sequence(struct nvbios *bios, uint16_t offset,
 		     struct init_exec *iexec)
 {
@@ -1965,6 +2020,64 @@ init_sub_direct(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 }
 
 static int
+init_i2c_if(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
+{
+	/*
+	 * INIT_I2C_IF   opcode: 0x5E ('^')
+	 *
+	 * offset      (8 bit): opcode
+	 * offset + 1  (8 bit): DCB I2C table entry index
+	 * offset + 2  (8 bit): I2C slave address
+	 * offset + 3  (8 bit): I2C register
+	 * offset + 4  (8 bit): mask
+	 * offset + 5  (8 bit): data
+	 *
+	 * Read the register given by "I2C register" on the device addressed
+	 * by "I2C slave address" on the I2C bus given by "DCB I2C table
+	 * entry index". Compare the result AND "mask" to "data".
+	 * If they're not equal, skip subsequent opcodes until condition is
+	 * inverted (INIT_NOT), or we hit INIT_RESUME
+	 */
+
+	uint8_t i2c_index = bios->data[offset + 1];
+	uint8_t i2c_address = bios->data[offset + 2] >> 1;
+	uint8_t reg = bios->data[offset + 3];
+	uint8_t mask = bios->data[offset + 4];
+	uint8_t data = bios->data[offset + 5];
+	struct nouveau_i2c_chan *chan;
+	union i2c_smbus_data val;
+	int ret;
+
+	/* no execute check by design */
+
+	BIOSLOG(bios, "0x%04X: DCBI2CIndex: 0x%02X, I2CAddress: 0x%02X\n",
+		offset, i2c_index, i2c_address);
+
+	chan = init_i2c_device_find(bios->dev, i2c_index);
+	if (!chan)
+		return -ENODEV;
+
+	ret = i2c_smbus_xfer(&chan->adapter, i2c_address, 0,
+			     I2C_SMBUS_READ, reg,
+			     I2C_SMBUS_BYTE_DATA, &val);
+	if (ret < 0) {
+		BIOSLOG(bios, "0x%04X: I2CReg: 0x%02X, Value: [no device], "
+			      "Mask: 0x%02X, Data: 0x%02X\n",
+			offset, reg, mask, data);
+		iexec->execute = 0;
+		return 6;
+	}
+
+	BIOSLOG(bios, "0x%04X: I2CReg: 0x%02X, Value: 0x%02X, "
+		      "Mask: 0x%02X, Data: 0x%02X\n",
+		offset, reg, val.byte, mask, data);
+
+	iexec->execute = ((val.byte & mask) == data);
+
+	return 6;
+}
+
+static int
 init_copy_nv_reg(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 {
 	/*
@@ -2039,6 +2152,325 @@ init_zm_index_io(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	return 5;
 }
 
+static inline void
+bios_md32(struct nvbios *bios, uint32_t reg,
+	  uint32_t mask, uint32_t val)
+{
+	bios_wr32(bios, reg, (bios_rd32(bios, reg) & ~mask) | val);
+}
+
+static uint32_t
+peek_fb(struct drm_device *dev, struct io_mapping *fb,
+	uint32_t off)
+{
+	uint32_t val = 0;
+
+	if (off < pci_resource_len(dev->pdev, 1)) {
+		uint8_t __iomem *p =
+			io_mapping_map_atomic_wc(fb, off & PAGE_MASK);
+
+		val = ioread32(p + (off & ~PAGE_MASK));
+
+		io_mapping_unmap_atomic(p);
+	}
+
+	return val;
+}
+
+static void
+poke_fb(struct drm_device *dev, struct io_mapping *fb,
+	uint32_t off, uint32_t val)
+{
+	if (off < pci_resource_len(dev->pdev, 1)) {
+		uint8_t __iomem *p =
+			io_mapping_map_atomic_wc(fb, off & PAGE_MASK);
+
+		iowrite32(val, p + (off & ~PAGE_MASK));
+		wmb();
+
+		io_mapping_unmap_atomic(p);
+	}
+}
+
+static inline bool
+read_back_fb(struct drm_device *dev, struct io_mapping *fb,
+	     uint32_t off, uint32_t val)
+{
+	poke_fb(dev, fb, off, val);
+	return val == peek_fb(dev, fb, off);
+}
+
+static int
+nv04_init_compute_mem(struct nvbios *bios)
+{
+	struct drm_device *dev = bios->dev;
+	uint32_t patt = 0xdeadbeef;
+	struct io_mapping *fb;
+	int i;
+
+	/* Map the framebuffer aperture */
+	fb = io_mapping_create_wc(pci_resource_start(dev->pdev, 1),
+				  pci_resource_len(dev->pdev, 1));
+	if (!fb)
+		return -ENOMEM;
+
+	/* Sequencer and refresh off */
+	NVWriteVgaSeq(dev, 0, 1, NVReadVgaSeq(dev, 0, 1) | 0x20);
+	bios_md32(bios, NV04_PFB_DEBUG_0, 0, NV04_PFB_DEBUG_0_REFRESH_OFF);
+
+	bios_md32(bios, NV04_PFB_BOOT_0, ~0,
+		  NV04_PFB_BOOT_0_RAM_AMOUNT_16MB |
+		  NV04_PFB_BOOT_0_RAM_WIDTH_128 |
+		  NV04_PFB_BOOT_0_RAM_TYPE_SGRAM_16MBIT);
+
+	for (i = 0; i < 4; i++)
+		poke_fb(dev, fb, 4 * i, patt);
+
+	poke_fb(dev, fb, 0x400000, patt + 1);
+
+	if (peek_fb(dev, fb, 0) == patt + 1) {
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_TYPE,
+			  NV04_PFB_BOOT_0_RAM_TYPE_SDRAM_16MBIT);
+		bios_md32(bios, NV04_PFB_DEBUG_0,
+			  NV04_PFB_DEBUG_0_REFRESH_OFF, 0);
+
+		for (i = 0; i < 4; i++)
+			poke_fb(dev, fb, 4 * i, patt);
+
+		if ((peek_fb(dev, fb, 0xc) & 0xffff) != (patt & 0xffff))
+			bios_md32(bios, NV04_PFB_BOOT_0,
+				  NV04_PFB_BOOT_0_RAM_WIDTH_128 |
+				  NV04_PFB_BOOT_0_RAM_AMOUNT,
+				  NV04_PFB_BOOT_0_RAM_AMOUNT_8MB);
+
+	} else if ((peek_fb(dev, fb, 0xc) & 0xffff0000) !=
+		   (patt & 0xffff0000)) {
+		bios_md32(bios, NV04_PFB_BOOT_0,
+			  NV04_PFB_BOOT_0_RAM_WIDTH_128 |
+			  NV04_PFB_BOOT_0_RAM_AMOUNT,
+			  NV04_PFB_BOOT_0_RAM_AMOUNT_4MB);
+
+	} else if (peek_fb(dev, fb, 0) != patt) {
+		if (read_back_fb(dev, fb, 0x800000, patt))
+			bios_md32(bios, NV04_PFB_BOOT_0,
+				  NV04_PFB_BOOT_0_RAM_AMOUNT,
+				  NV04_PFB_BOOT_0_RAM_AMOUNT_8MB);
+		else
+			bios_md32(bios, NV04_PFB_BOOT_0,
+				  NV04_PFB_BOOT_0_RAM_AMOUNT,
+				  NV04_PFB_BOOT_0_RAM_AMOUNT_4MB);
+
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_TYPE,
+			  NV04_PFB_BOOT_0_RAM_TYPE_SGRAM_8MBIT);
+
+	} else if (!read_back_fb(dev, fb, 0x800000, patt)) {
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_AMOUNT,
+			  NV04_PFB_BOOT_0_RAM_AMOUNT_8MB);
+
+	}
+
+	/* Refresh on, sequencer on */
+	bios_md32(bios, NV04_PFB_DEBUG_0, NV04_PFB_DEBUG_0_REFRESH_OFF, 0);
+	NVWriteVgaSeq(dev, 0, 1, NVReadVgaSeq(dev, 0, 1) & ~0x20);
+
+	io_mapping_free(fb);
+	return 0;
+}
+
+static const uint8_t *
+nv05_memory_config(struct nvbios *bios)
+{
+	/* Defaults for BIOSes lacking a memory config table */
+	static const uint8_t default_config_tab[][2] = {
+		{ 0x24, 0x00 },
+		{ 0x28, 0x00 },
+		{ 0x24, 0x01 },
+		{ 0x1f, 0x00 },
+		{ 0x0f, 0x00 },
+		{ 0x17, 0x00 },
+		{ 0x06, 0x00 },
+		{ 0x00, 0x00 }
+	};
+	int i = (bios_rd32(bios, NV_PEXTDEV_BOOT_0) &
+		 NV_PEXTDEV_BOOT_0_RAMCFG) >> 2;
+
+	if (bios->legacy.mem_init_tbl_ptr)
+		return &bios->data[bios->legacy.mem_init_tbl_ptr + 2 * i];
+	else
+		return default_config_tab[i];
+}
+
+static int
+nv05_init_compute_mem(struct nvbios *bios)
+{
+	struct drm_device *dev = bios->dev;
+	const uint8_t *ramcfg = nv05_memory_config(bios);
+	uint32_t patt = 0xdeadbeef;
+	struct io_mapping *fb;
+	int i, v;
+
+	/* Map the framebuffer aperture */
+	fb = io_mapping_create_wc(pci_resource_start(dev->pdev, 1),
+				  pci_resource_len(dev->pdev, 1));
+	if (!fb)
+		return -ENOMEM;
+
+	/* Sequencer off */
+	NVWriteVgaSeq(dev, 0, 1, NVReadVgaSeq(dev, 0, 1) | 0x20);
+
+	if (bios_rd32(bios, NV04_PFB_BOOT_0) & NV04_PFB_BOOT_0_UMA_ENABLE)
+		goto out;
+
+	bios_md32(bios, NV04_PFB_DEBUG_0, NV04_PFB_DEBUG_0_REFRESH_OFF, 0);
+
+	/* If present, load the hardcoded scrambling table */
+	if (bios->legacy.mem_init_tbl_ptr) {
+		uint32_t *scramble_tab = (uint32_t *)&bios->data[
+			bios->legacy.mem_init_tbl_ptr + 0x10];
+
+		for (i = 0; i < 8; i++)
+			bios_wr32(bios, NV04_PFB_SCRAMBLE(i),
+				  ROM32(scramble_tab[i]));
+	}
+
+	/* Set memory type/width/length defaults depending on the straps */
+	bios_md32(bios, NV04_PFB_BOOT_0, 0x3f, ramcfg[0]);
+
+	if (ramcfg[1] & 0x80)
+		bios_md32(bios, NV04_PFB_CFG0, 0, NV04_PFB_CFG0_SCRAMBLE);
+
+	bios_md32(bios, NV04_PFB_CFG1, 0x700001, (ramcfg[1] & 1) << 20);
+	bios_md32(bios, NV04_PFB_CFG1, 0, 1);
+
+	/* Probe memory bus width */
+	for (i = 0; i < 4; i++)
+		poke_fb(dev, fb, 4 * i, patt);
+
+	if (peek_fb(dev, fb, 0xc) != patt)
+		bios_md32(bios, NV04_PFB_BOOT_0,
+			  NV04_PFB_BOOT_0_RAM_WIDTH_128, 0);
+
+	/* Probe memory length */
+	v = bios_rd32(bios, NV04_PFB_BOOT_0) & NV04_PFB_BOOT_0_RAM_AMOUNT;
+
+	if (v == NV04_PFB_BOOT_0_RAM_AMOUNT_32MB &&
+	    (!read_back_fb(dev, fb, 0x1000000, ++patt) ||
+	     !read_back_fb(dev, fb, 0, ++patt)))
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_AMOUNT,
+			  NV04_PFB_BOOT_0_RAM_AMOUNT_16MB);
+
+	if (v == NV04_PFB_BOOT_0_RAM_AMOUNT_16MB &&
+	    !read_back_fb(dev, fb, 0x800000, ++patt))
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_AMOUNT,
+			  NV04_PFB_BOOT_0_RAM_AMOUNT_8MB);
+
+	if (!read_back_fb(dev, fb, 0x400000, ++patt))
+		bios_md32(bios, NV04_PFB_BOOT_0, NV04_PFB_BOOT_0_RAM_AMOUNT,
+			  NV04_PFB_BOOT_0_RAM_AMOUNT_4MB);
+
+out:
+	/* Sequencer on */
+	NVWriteVgaSeq(dev, 0, 1, NVReadVgaSeq(dev, 0, 1) & ~0x20);
+
+	io_mapping_free(fb);
+	return 0;
+}
+
+static int
+nv10_init_compute_mem(struct nvbios *bios)
+{
+	struct drm_device *dev = bios->dev;
+	struct drm_nouveau_private *dev_priv = bios->dev->dev_private;
+	const int mem_width[] = { 0x10, 0x00, 0x20 };
+	const int mem_width_count = (dev_priv->chipset >= 0x17 ? 3 : 2);
+	uint32_t patt = 0xdeadbeef;
+	struct io_mapping *fb;
+	int i, j, k;
+
+	/* Map the framebuffer aperture */
+	fb = io_mapping_create_wc(pci_resource_start(dev->pdev, 1),
+				  pci_resource_len(dev->pdev, 1));
+	if (!fb)
+		return -ENOMEM;
+
+	bios_wr32(bios, NV10_PFB_REFCTRL, NV10_PFB_REFCTRL_VALID_1);
+
+	/* Probe memory bus width */
+	for (i = 0; i < mem_width_count; i++) {
+		bios_md32(bios, NV04_PFB_CFG0, 0x30, mem_width[i]);
+
+		for (j = 0; j < 4; j++) {
+			for (k = 0; k < 4; k++)
+				poke_fb(dev, fb, 0x1c, 0);
+
+			poke_fb(dev, fb, 0x1c, patt);
+			poke_fb(dev, fb, 0x3c, 0);
+
+			if (peek_fb(dev, fb, 0x1c) == patt)
+				goto mem_width_found;
+		}
+	}
+
+mem_width_found:
+	patt <<= 1;
+
+	/* Probe amount of installed memory */
+	for (i = 0; i < 4; i++) {
+		int off = bios_rd32(bios, NV04_PFB_FIFO_DATA) - 0x100000;
+
+		poke_fb(dev, fb, off, patt);
+		poke_fb(dev, fb, 0, 0);
+
+		peek_fb(dev, fb, 0);
+		peek_fb(dev, fb, 0);
+		peek_fb(dev, fb, 0);
+		peek_fb(dev, fb, 0);
+
+		if (peek_fb(dev, fb, off) == patt)
+			goto amount_found;
+	}
+
+	/* IC missing - disable the upper half memory space. */
+	bios_md32(bios, NV04_PFB_CFG0, 0x1000, 0);
+
+amount_found:
+	io_mapping_free(fb);
+	return 0;
+}
+
+static int
+nv20_init_compute_mem(struct nvbios *bios)
+{
+	struct drm_device *dev = bios->dev;
+	struct drm_nouveau_private *dev_priv = bios->dev->dev_private;
+	uint32_t mask = (dev_priv->chipset >= 0x25 ? 0x300 : 0x900);
+	uint32_t amount, off;
+	struct io_mapping *fb;
+
+	/* Map the framebuffer aperture */
+	fb = io_mapping_create_wc(pci_resource_start(dev->pdev, 1),
+				  pci_resource_len(dev->pdev, 1));
+	if (!fb)
+		return -ENOMEM;
+
+	bios_wr32(bios, NV10_PFB_REFCTRL, NV10_PFB_REFCTRL_VALID_1);
+
+	/* Allow full addressing */
+	bios_md32(bios, NV04_PFB_CFG0, 0, mask);
+
+	amount = bios_rd32(bios, NV04_PFB_FIFO_DATA);
+	for (off = amount; off > 0x2000000; off -= 0x2000000)
+		poke_fb(dev, fb, off - 4, off);
+
+	amount = bios_rd32(bios, NV04_PFB_FIFO_DATA);
+	if (amount != peek_fb(dev, fb, amount - 4))
+		/* IC missing - disable the upper half memory space. */
+		bios_md32(bios, NV04_PFB_CFG0, mask, 0);
+
+	io_mapping_free(fb);
+	return 0;
+}
+
 static int
 init_compute_mem(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 {
@@ -2047,64 +2479,57 @@ init_compute_mem(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 *
 	 * offset      (8 bit): opcode
 	 *
-	 * This opcode is meant to set NV_PFB_CFG0 (0x100200) appropriately so
-	 * that the hardware can correctly calculate how much VRAM it has
-	 * (and subsequently report that value in NV_PFB_CSTATUS (0x10020C))
+	 * This opcode is meant to set the PFB memory config registers
+	 * appropriately so that we can correctly calculate how much VRAM the
+	 * card has (on nv10 and newer chipsets the amount of installed VRAM
+	 * is subsequently reported in NV_PFB_CSTATUS (0x10020C)).
 	 *
-	 * The implementation of this opcode in general consists of two parts:
-	 * 1) determination of the memory bus width
-	 * 2) determination of how many of the card's RAM pads have ICs attached
+	 * The implementation of this opcode in general consists of several
+	 * parts:
 	 *
-	 * 1) is done by a cunning combination of writes to offsets 0x1c and
-	 * 0x3c in the framebuffer, and seeing whether the written values are
-	 * read back correctly. This then affects bits 4-7 of NV_PFB_CFG0
+	 * 1) Determination of memory type and density. Only necessary for
+	 *    really old chipsets; the memory type reported by the strap bits
+	 *    (0x101000) is assumed to be accurate on nv05 and newer.
 	 *
-	 * 2) is done by a cunning combination of writes to an offset slightly
-	 * less than the maximum memory reported by NV_PFB_CSTATUS, then seeing
-	 * if the test pattern can be read back. This then affects bits 12-15 of
-	 * NV_PFB_CFG0
+	 * 2) Determination of the memory bus width. Usually done by a cunning
+	 *    combination of writes to offsets 0x1c and 0x3c in the fb, and
+	 *    seeing whether the written values are read back correctly.
 	 *
-	 * In this context a "cunning combination" may include multiple reads
-	 * and writes to varying locations, often alternating the test pattern
-	 * and 0, doubtless to make sure buffers are filled, residual charges
-	 * on tracks are removed etc.
+	 *    Only necessary on nv0x-nv1x and nv34; on the other cards we can
+	 *    trust the straps.
 	 *
-	 * Unfortunately, the "cunning combination"s mentioned above, and the
-	 * changes to the bits in NV_PFB_CFG0 differ with nearly every bios
-	 * trace I have.
+	 * 3) Determination of how many of the card's RAM pads have ICs
+	 *    attached, usually done by a cunning combination of writes to an
+	 *    offset slightly less than the maximum memory reported by
+	 *    NV_PFB_CSTATUS, then seeing if the test pattern can be read back.
 	 *
-	 * Therefore, we cheat and assume the value of NV_PFB_CFG0 with which
-	 * we started was correct, and use that instead
+	 * This appears to be a NOP on IGPs and NV4x or newer chipsets; both io
+	 * logs of the VBIOS and kmmio traces of the binary driver POSTing the
+	 * card show nothing being done for this opcode. Why is it still listed
+	 * in the table?!
 	 */
 
 	/* no iexec->execute check by design */
 
-	/*
-	 * This appears to be a NOP on G8x chipsets, both io logs of the VBIOS
-	 * and kmmio traces of the binary driver POSTing the card show nothing
-	 * being done for this opcode.  why is it still listed in the table?!
-	 */
-
 	struct drm_nouveau_private *dev_priv = bios->dev->dev_private;
+	int ret;
 
-	if (dev_priv->card_type >= NV_40)
-		return 1;
-
-	/*
-	 * On every card I've seen, this step gets done for us earlier in
-	 * the init scripts
-	uint8_t crdata = bios_idxprt_rd(dev, NV_VIO_SRX, 0x01);
-	bios_idxprt_wr(dev, NV_VIO_SRX, 0x01, crdata | 0x20);
-	 */
-
-	/*
-	 * This also has probably been done in the scripts, but an mmio trace of
-	 * s3 resume shows nvidia doing it anyway (unlike the NV_VIO_SRX write)
-	 */
-	bios_wr32(bios, NV_PFB_REFCTRL, NV_PFB_REFCTRL_VALID_1);
+	if (dev_priv->chipset >= 0x40 ||
+	    dev_priv->chipset == 0x1a ||
+	    dev_priv->chipset == 0x1f)
+		ret = 0;
+	else if (dev_priv->chipset >= 0x20 &&
+		 dev_priv->chipset != 0x34)
+		ret = nv20_init_compute_mem(bios);
+	else if (dev_priv->chipset >= 0x10)
+		ret = nv10_init_compute_mem(bios);
+	else if (dev_priv->chipset >= 0x5)
+		ret = nv05_init_compute_mem(bios);
+	else
+		ret = nv04_init_compute_mem(bios);
 
-	/* write back the saved configuration value */
-	bios_wr32(bios, NV_PFB_CFG0, bios->state.saved_nv_pfb_cfg0);
+	if (ret)
+		return ret;
 
 	return 1;
 }
@@ -2131,7 +2556,8 @@ init_reset(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	/* no iexec->execute check by design */
 
 	pci_nv_19 = bios_rd32(bios, NV_PBUS_PCI_NV_19);
-	bios_wr32(bios, NV_PBUS_PCI_NV_19, 0);
+	bios_wr32(bios, NV_PBUS_PCI_NV_19, pci_nv_19 & ~0xf00);
+
 	bios_wr32(bios, reg, value1);
 
 	udelay(10);
@@ -2167,7 +2593,7 @@ init_configure_mem(struct nvbios *bios, uint16_t offset,
 	uint32_t reg, data;
 
 	if (bios->major_version > 2)
-		return -ENODEV;
+		return 0;
 
 	bios_idxprt_wr(bios, NV_VIO_SRX, NV_VIO_SR_CLOCK_INDEX, bios_idxprt_rd(
 		       bios, NV_VIO_SRX, NV_VIO_SR_CLOCK_INDEX) | 0x20);
@@ -2180,14 +2606,14 @@ init_configure_mem(struct nvbios *bios, uint16_t offset,
 	     reg = ROM32(bios->data[seqtbloffs += 4])) {
 
 		switch (reg) {
-		case NV_PFB_PRE:
-			data = NV_PFB_PRE_CMD_PRECHARGE;
+		case NV04_PFB_PRE:
+			data = NV04_PFB_PRE_CMD_PRECHARGE;
 			break;
-		case NV_PFB_PAD:
-			data = NV_PFB_PAD_CKE_NORMAL;
+		case NV04_PFB_PAD:
+			data = NV04_PFB_PAD_CKE_NORMAL;
 			break;
-		case NV_PFB_REF:
-			data = NV_PFB_REF_CMD_REFRESH;
+		case NV04_PFB_REF:
+			data = NV04_PFB_REF_CMD_REFRESH;
 			break;
 		default:
 			data = ROM32(bios->data[meminitdata]);
@@ -2222,7 +2648,7 @@ init_configure_clk(struct nvbios *bios, uint16_t offset,
 	int clock;
 
 	if (bios->major_version > 2)
-		return -ENODEV;
+		return 0;
 
 	clock = ROM16(bios->data[meminitoffs + 4]) * 10;
 	setPLL(bios, NV_PRAMDAC_NVPLL_COEFF, clock);
@@ -2252,10 +2678,10 @@ init_configure_preinit(struct nvbios *bios, uint16_t offset,
 	/* no iexec->execute check by design */
 
 	uint32_t straps = bios_rd32(bios, NV_PEXTDEV_BOOT_0);
-	uint8_t cr3c = ((straps << 2) & 0xf0) | (straps & (1 << 6));
+	uint8_t cr3c = ((straps << 2) & 0xf0) | ((straps & 0x40) >> 6);
 
 	if (bios->major_version > 2)
-		return -ENODEV;
+		return 0;
 
 	bios_idxprt_wr(bios, NV_CIO_CRX__COLOR,
 			     NV_CIO_CRE_SCRATCH4__INDEX, cr3c);
@@ -2389,7 +2815,7 @@ init_ram_condition(struct nvbios *bios, uint16_t offset,
 	 * offset + 1  (8 bit): mask
 	 * offset + 2  (8 bit): cmpval
 	 *
-	 * Test if (NV_PFB_BOOT_0 & "mask") equals "cmpval".
+	 * Test if (NV04_PFB_BOOT_0 & "mask") equals "cmpval".
 	 * If condition not met skip subsequent opcodes until condition is
 	 * inverted (INIT_NOT), or we hit INIT_RESUME
 	 */
@@ -2401,7 +2827,7 @@ init_ram_condition(struct nvbios *bios, uint16_t offset,
 	if (!iexec->execute)
 		return 3;
 
-	data = bios_rd32(bios, NV_PFB_BOOT_0) & mask;
+	data = bios_rd32(bios, NV04_PFB_BOOT_0) & mask;
 
 	BIOSLOG(bios, "0x%04X: Checking if 0x%08X equals 0x%08X\n",
 		offset, data, cmpval);
@@ -2795,12 +3221,13 @@ init_gpio(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 	 */
 
 	struct drm_nouveau_private *dev_priv = bios->dev->dev_private;
+	struct nouveau_gpio_engine *pgpio = &dev_priv->engine.gpio;
 	const uint32_t nv50_gpio_ctl[2] = { 0xe100, 0xe28c };
 	int i;
 
-	if (dev_priv->card_type != NV_50) {
+	if (dev_priv->card_type < NV_50) {
 		NV_ERROR(bios->dev, "INIT_GPIO on unsupported chipset\n");
-		return -ENODEV;
+		return 1;
 	}
 
 	if (!iexec->execute)
@@ -2815,7 +3242,7 @@ init_gpio(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		BIOSLOG(bios, "0x%04X: set gpio 0x%02x, state %d\n",
 			offset, gpio->tag, gpio->state_default);
 		if (bios->execute)
-			nv50_gpio_set(bios->dev, gpio->tag, gpio->state_default);
+			pgpio->set(bios->dev, gpio->tag, gpio->state_default);
 
 		/* The NVIDIA binary driver doesn't appear to actually do
 		 * any of this, my VBIOS does however.
@@ -2872,10 +3299,7 @@ init_ram_restrict_zm_reg_group(struct nvbios *bios, uint16_t offset,
 	uint8_t index;
 	int i;
 
-
-	if (!iexec->execute)
-		return len;
-
+	/* critical! to know the length of the opcode */
 	if (!blocklen) {
 		NV_ERROR(bios->dev,
 			 "0x%04X: Zero block length - has the M table "
@@ -2883,6 +3307,9 @@ init_ram_restrict_zm_reg_group(struct nvbios *bios, uint16_t offset,
 		return -EINVAL;
 	}
 
+	if (!iexec->execute)
+		return len;
+
 	strap_ramcfg = (bios_rd32(bios, NV_PEXTDEV_BOOT_0) >> 2) & 0xf;
 	index = bios->data[bios->ram_restrict_tbl_ptr + strap_ramcfg];
 
@@ -3064,14 +3491,14 @@ init_auxch(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 
 	if (!bios->display.output) {
 		NV_ERROR(dev, "INIT_AUXCH: no active output\n");
-		return -EINVAL;
+		return len;
 	}
 
 	auxch = init_i2c_device_find(dev, bios->display.output->i2c_index);
 	if (!auxch) {
 		NV_ERROR(dev, "INIT_AUXCH: couldn't get auxch %d\n",
 			 bios->display.output->i2c_index);
-		return -ENODEV;
+		return len;
 	}
 
 	if (!iexec->execute)
@@ -3084,7 +3511,7 @@ init_auxch(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = nouveau_dp_auxch(auxch, 9, addr, &data, 1);
 		if (ret) {
 			NV_ERROR(dev, "INIT_AUXCH: rd auxch fail %d\n", ret);
-			return ret;
+			return len;
 		}
 
 		data &= bios->data[offset + 0];
@@ -3093,7 +3520,7 @@ init_auxch(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = nouveau_dp_auxch(auxch, 8, addr, &data, 1);
 		if (ret) {
 			NV_ERROR(dev, "INIT_AUXCH: wr auxch fail %d\n", ret);
-			return ret;
+			return len;
 		}
 	}
 
@@ -3123,14 +3550,14 @@ init_zm_auxch(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 
 	if (!bios->display.output) {
 		NV_ERROR(dev, "INIT_ZM_AUXCH: no active output\n");
-		return -EINVAL;
+		return len;
 	}
 
 	auxch = init_i2c_device_find(dev, bios->display.output->i2c_index);
 	if (!auxch) {
 		NV_ERROR(dev, "INIT_ZM_AUXCH: couldn't get auxch %d\n",
 			 bios->display.output->i2c_index);
-		return -ENODEV;
+		return len;
 	}
 
 	if (!iexec->execute)
@@ -3141,13 +3568,76 @@ init_zm_auxch(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
 		ret = nouveau_dp_auxch(auxch, 8, addr, &bios->data[offset], 1);
 		if (ret) {
 			NV_ERROR(dev, "INIT_ZM_AUXCH: wr auxch fail %d\n", ret);
-			return ret;
+			return len;
 		}
 	}
 
 	return len;
 }
 
+static int
+init_i2c_long_if(struct nvbios *bios, uint16_t offset, struct init_exec *iexec)
+{
+	/*
+	 * INIT_I2C_LONG_IF   opcode: 0x9A ('')
+	 *
+	 * offset      (8 bit): opcode
+	 * offset + 1  (8 bit): DCB I2C table entry index
+	 * offset + 2  (8 bit): I2C slave address
+	 * offset + 3  (16 bit): I2C register
+	 * offset + 5  (8 bit): mask
+	 * offset + 6  (8 bit): data
+	 *
+	 * Read the register given by "I2C register" on the device addressed
+	 * by "I2C slave address" on the I2C bus given by "DCB I2C table
+	 * entry index". Compare the result AND "mask" to "data".
+	 * If they're not equal, skip subsequent opcodes until condition is
+	 * inverted (INIT_NOT), or we hit INIT_RESUME
+	 */
+
+	uint8_t i2c_index = bios->data[offset + 1];
+	uint8_t i2c_address = bios->data[offset + 2] >> 1;
+	uint8_t reglo = bios->data[offset + 3];
+	uint8_t reghi = bios->data[offset + 4];
+	uint8_t mask = bios->data[offset + 5];
+	uint8_t data = bios->data[offset + 6];
+	struct nouveau_i2c_chan *chan;
+	uint8_t buf0[2] = { reghi, reglo };
+	uint8_t buf1[1];
+	struct i2c_msg msg[2] = {
+		{ i2c_address, 0, 1, buf0 },
+		{ i2c_address, I2C_M_RD, 1, buf1 },
+	};
+	int ret;
+
+	/* no execute check by design */
+
+	BIOSLOG(bios, "0x%04X: DCBI2CIndex: 0x%02X, I2CAddress: 0x%02X\n",
+		offset, i2c_index, i2c_address);
+
+	chan = init_i2c_device_find(bios->dev, i2c_index);
+	if (!chan)
+		return -ENODEV;
+
+
+	ret = i2c_transfer(&chan->adapter, msg, 2);
+	if (ret < 0) {
+		BIOSLOG(bios, "0x%04X: I2CReg: 0x%02X:0x%02X, Value: [no device], "
+			      "Mask: 0x%02X, Data: 0x%02X\n",
+			offset, reghi, reglo, mask, data);
+		iexec->execute = 0;
+		return 7;
+	}
+
+	BIOSLOG(bios, "0x%04X: I2CReg: 0x%02X:0x%02X, Value: 0x%02X, "
+		      "Mask: 0x%02X, Data: 0x%02X\n",
+		offset, reghi, reglo, buf1[0], mask, data);
+
+	iexec->execute = ((buf1[0] & mask) == data);
+
+	return 7;
+}
+
 static struct init_tbl_entry itbl_entry[] = {
 	/* command name                       , id  , length  , offset  , mult    , command handler                 */
 	/* INIT_PROG (0x31, 15, 10, 4) removed due to no example of use */
@@ -3174,9 +3664,11 @@ static struct init_tbl_entry itbl_entry[] = {
 	{ "INIT_ZM_CR"                        , 0x53, init_zm_cr                      },
 	{ "INIT_ZM_CR_GROUP"                  , 0x54, init_zm_cr_group                },
 	{ "INIT_CONDITION_TIME"               , 0x56, init_condition_time             },
+	{ "INIT_LTIME"                        , 0x57, init_ltime                      },
 	{ "INIT_ZM_REG_SEQUENCE"              , 0x58, init_zm_reg_sequence            },
 	/* INIT_INDIRECT_REG (0x5A, 7, 0, 0) removed due to no example of use */
 	{ "INIT_SUB_DIRECT"                   , 0x5B, init_sub_direct                 },
+	{ "INIT_I2C_IF"                       , 0x5E, init_i2c_if                     },
 	{ "INIT_COPY_NV_REG"                  , 0x5F, init_copy_nv_reg                },
 	{ "INIT_ZM_INDEX_IO"                  , 0x62, init_zm_index_io                },
 	{ "INIT_COMPUTE_MEM"                  , 0x63, init_compute_mem                },
@@ -3210,6 +3702,7 @@ static struct init_tbl_entry itbl_entry[] = {
 	{ "INIT_97"                           , 0x97, init_97                         },
 	{ "INIT_AUXCH"                        , 0x98, init_auxch                      },
 	{ "INIT_ZM_AUXCH"                     , 0x99, init_zm_auxch                   },
+	{ "INIT_I2C_LONG_IF"                  , 0x9A, init_i2c_long_if                },
 	{ NULL                                , 0   , NULL                            }
 };
 
@@ -3376,27 +3869,10 @@ static int call_lvds_manufacturer_script(struct drm_device *dev, struct dcb_entr
 	}
 #ifdef __powerpc__
 	/* Powerbook specific quirks */
-	if ((dev->pci_device & 0xffff) == 0x0179 ||
-	    (dev->pci_device & 0xffff) == 0x0189 ||
-	    (dev->pci_device & 0xffff) == 0x0329) {
-		if (script == LVDS_RESET) {
-			nv_write_tmds(dev, dcbent->or, 0, 0x02, 0x72);
-
-		} else if (script == LVDS_PANEL_ON) {
-			bios_wr32(bios, NV_PBUS_DEBUG_DUALHEAD_CTL,
-				  bios_rd32(bios, NV_PBUS_DEBUG_DUALHEAD_CTL)
-				  | (1 << 31));
-			bios_wr32(bios, NV_PCRTC_GPIO_EXT,
-				  bios_rd32(bios, NV_PCRTC_GPIO_EXT) | 1);
-
-		} else if (script == LVDS_PANEL_OFF) {
-			bios_wr32(bios, NV_PBUS_DEBUG_DUALHEAD_CTL,
-				  bios_rd32(bios, NV_PBUS_DEBUG_DUALHEAD_CTL)
-				  & ~(1 << 31));
-			bios_wr32(bios, NV_PCRTC_GPIO_EXT,
-				  bios_rd32(bios, NV_PCRTC_GPIO_EXT) & ~3);
-		}
-	}
+	if (script == LVDS_RESET &&
+	    (dev->pci_device == 0x0179 || dev->pci_device == 0x0189 ||
+	     dev->pci_device == 0x0329))
+		nv_write_tmds(dev, dcbent->or, 0, 0x02, 0x72);
 #endif
 
 	return 0;
@@ -3888,11 +4364,8 @@ int nouveau_bios_parse_lvds_table(struct drm_device *dev, int pxclk, bool *dl, b
 	 *
 	 * For the moment, a quirk will do :)
 	 */
-	if ((dev->pdev->device == 0x01d7) &&
-	    (dev->pdev->subsystem_vendor == 0x1028) &&
-	    (dev->pdev->subsystem_device == 0x01c2)) {
+	if (nv_match_device(dev, 0x01d7, 0x1028, 0x01c2))
 		bios->fp.duallink_transition_clk = 80000;
-	}
 
 	/* set dual_link flag for EDID case */
 	if (pxclk && (chip_version < 0x25 || chip_version > 0x28))
@@ -4068,7 +4541,7 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 					  bios->display.script_table_ptr,
 					  table[2], table[3], table[0] >= 0x21);
 	if (!otable) {
-		NV_ERROR(dev, "Couldn't find matching output script table\n");
+		NV_DEBUG_KMS(dev, "failed to match any output table\n");
 		return 1;
 	}
 
@@ -4094,7 +4567,7 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 			return 1;
 		}
 
-		NV_TRACE(dev, "0x%04X: parsing output script 0\n", script);
+		NV_DEBUG_KMS(dev, "0x%04X: parsing output script 0\n", script);
 		nouveau_bios_run_init_table(dev, script, dcbent);
 	} else
 	if (pxclk == -1) {
@@ -4104,7 +4577,7 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 			return 1;
 		}
 
-		NV_TRACE(dev, "0x%04X: parsing output script 1\n", script);
+		NV_DEBUG_KMS(dev, "0x%04X: parsing output script 1\n", script);
 		nouveau_bios_run_init_table(dev, script, dcbent);
 	} else
 	if (pxclk == -2) {
@@ -4117,7 +4590,7 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 			return 1;
 		}
 
-		NV_TRACE(dev, "0x%04X: parsing output script 2\n", script);
+		NV_DEBUG_KMS(dev, "0x%04X: parsing output script 2\n", script);
 		nouveau_bios_run_init_table(dev, script, dcbent);
 	} else
 	if (pxclk > 0) {
@@ -4125,11 +4598,11 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 		if (script)
 			script = clkcmptable(bios, script, pxclk);
 		if (!script) {
-			NV_ERROR(dev, "clock script 0 not found\n");
+			NV_DEBUG_KMS(dev, "clock script 0 not found\n");
 			return 1;
 		}
 
-		NV_TRACE(dev, "0x%04X: parsing clock script 0\n", script);
+		NV_DEBUG_KMS(dev, "0x%04X: parsing clock script 0\n", script);
 		nouveau_bios_run_init_table(dev, script, dcbent);
 	} else
 	if (pxclk < 0) {
@@ -4141,7 +4614,7 @@ nouveau_bios_run_display_table(struct drm_device *dev, struct dcb_entry *dcbent,
 			return 1;
 		}
 
-		NV_TRACE(dev, "0x%04X: parsing clock script 1\n", script);
+		NV_DEBUG_KMS(dev, "0x%04X: parsing clock script 1\n", script);
 		nouveau_bios_run_init_table(dev, script, dcbent);
 	}
 
@@ -4484,7 +4957,7 @@ int get_pll_limits(struct drm_device *dev, uint32_t limit_match, struct pll_lims
 		pll_lim->min_p = record[12];
 		pll_lim->max_p = record[13];
 		/* where did this go to?? */
-		if (limit_match == 0x00614100 || limit_match == 0x00614900)
+		if ((entry[0] & 0xf0) == 0x80)
 			pll_lim->refclk = 27000;
 		else
 			pll_lim->refclk = 100000;
@@ -4864,19 +5337,17 @@ static int parse_bit_tmds_tbl_entry(struct drm_device *dev, struct nvbios *bios,
 	}
 
 	tmdstableptr = ROM16(bios->data[bitentry->offset]);
-
-	if (tmdstableptr == 0x0) {
+	if (!tmdstableptr) {
 		NV_ERROR(dev, "Pointer to TMDS table invalid\n");
 		return -EINVAL;
 	}
 
+	NV_INFO(dev, "TMDS table version %d.%d\n",
+		bios->data[tmdstableptr] >> 4, bios->data[tmdstableptr] & 0xf);
+
 	/* nv50+ has v2.0, but we don't parse it atm */
-	if (bios->data[tmdstableptr] != 0x11) {
-		NV_WARN(dev,
-			"TMDS table revision %d.%d not currently supported\n",
-			bios->data[tmdstableptr] >> 4, bios->data[tmdstableptr] & 0xf);
+	if (bios->data[tmdstableptr] != 0x11)
 		return -ENOSYS;
-	}
 
 	/*
 	 * These two scripts are odd: they don't seem to get run even when
@@ -5151,10 +5622,14 @@ static int parse_bmp_structure(struct drm_device *dev, struct nvbios *bios, unsi
 	bios->legacy.i2c_indices.crt = bios->data[legacy_i2c_offset];
 	bios->legacy.i2c_indices.tv = bios->data[legacy_i2c_offset + 1];
 	bios->legacy.i2c_indices.panel = bios->data[legacy_i2c_offset + 2];
-	bios->dcb.i2c[0].write = bios->data[legacy_i2c_offset + 4];
-	bios->dcb.i2c[0].read = bios->data[legacy_i2c_offset + 5];
-	bios->dcb.i2c[1].write = bios->data[legacy_i2c_offset + 6];
-	bios->dcb.i2c[1].read = bios->data[legacy_i2c_offset + 7];
+	if (bios->data[legacy_i2c_offset + 4])
+		bios->dcb.i2c[0].write = bios->data[legacy_i2c_offset + 4];
+	if (bios->data[legacy_i2c_offset + 5])
+		bios->dcb.i2c[0].read = bios->data[legacy_i2c_offset + 5];
+	if (bios->data[legacy_i2c_offset + 6])
+		bios->dcb.i2c[1].write = bios->data[legacy_i2c_offset + 6];
+	if (bios->data[legacy_i2c_offset + 7])
+		bios->dcb.i2c[1].read = bios->data[legacy_i2c_offset + 7];
 
 	if (bmplength > 74) {
 		bios->fmaxvco = ROM32(bmp[67]);
@@ -5312,6 +5787,20 @@ parse_dcb_gpio_table(struct nvbios *bios)
 			gpio->line = tvdac_gpio[1] >> 4;
 			gpio->invert = tvdac_gpio[0] & 2;
 		}
+	} else {
+		/*
+		 * No systematic way to store GPIO info on pre-v2.2
+		 * DCBs; try to match the PCI device IDs instead.
+		 */
+
+		/* Apple iMac G4 NV18 */
+		if (nv_match_device(dev, 0x0189, 0x10de, 0x0010)) {
+			struct dcb_gpio_entry *gpio = new_gpio_entry(bios);
+
+			gpio->tag = DCB_GPIO_TVDAC0;
+			gpio->line = 4;
+		}
+
 	}
 
 	if (!gpio_table_ptr)
@@ -5387,9 +5876,7 @@ apply_dcb_connector_quirks(struct nvbios *bios, int idx)
 	struct drm_device *dev = bios->dev;
 
 	/* Gigabyte NX85T */
-	if ((dev->pdev->device == 0x0421) &&
-	    (dev->pdev->subsystem_vendor == 0x1458) &&
-	    (dev->pdev->subsystem_device == 0x344c)) {
+	if (nv_match_device(dev, 0x0421, 0x1458, 0x344c)) {
 		if (cte->type == DCB_CONNECTOR_HDMI_1)
 			cte->type = DCB_CONNECTOR_DVI_I;
 	}
@@ -5506,7 +5993,7 @@ static void fabricate_vga_output(struct dcb_table *dcb, int i2c, int heads)
 	entry->i2c_index = i2c;
 	entry->heads = heads;
 	entry->location = DCB_LOC_ON_CHIP;
-	/* "or" mostly unused in early gen crt modesetting, 0 is fine */
+	entry->or = 1;
 }
 
 static void fabricate_dvi_i_output(struct dcb_table *dcb, bool twoHeads)
@@ -5589,9 +6076,12 @@ parse_dcb20_entry(struct drm_device *dev, struct dcb_table *dcb,
 			if (conf & 0x4 || conf & 0x8)
 				entry->lvdsconf.use_power_scripts = true;
 		} else {
-			mask = ~0x5;
+			mask = ~0x7;
+			if (conf & 0x2)
+				entry->lvdsconf.use_acpi_for_edid = true;
 			if (conf & 0x4)
 				entry->lvdsconf.use_power_scripts = true;
+			entry->lvdsconf.sor.link = (conf & 0x00000030) >> 4;
 		}
 		if (conf & mask) {
 			/*
@@ -5631,9 +6121,15 @@ parse_dcb20_entry(struct drm_device *dev, struct dcb_table *dcb,
 		}
 		break;
 	case OUTPUT_TMDS:
-		entry->tmdsconf.sor.link = (conf & 0x00000030) >> 4;
+		if (dcb->version >= 0x40)
+			entry->tmdsconf.sor.link = (conf & 0x00000030) >> 4;
+		else if (dcb->version >= 0x30)
+			entry->tmdsconf.slave_addr = (conf & 0x00000700) >> 8;
+		else if (dcb->version >= 0x22)
+			entry->tmdsconf.slave_addr = (conf & 0x00000070) >> 4;
+
 		break;
-	case 0xe:
+	case OUTPUT_EOL:
 		/* weird g80 mobile type that "nv" treats as a terminator */
 		dcb->entries--;
 		return false;
@@ -5670,22 +6166,14 @@ parse_dcb15_entry(struct drm_device *dev, struct dcb_table *dcb,
 		entry->type = OUTPUT_TV;
 		break;
 	case 2:
-	case 3:
-		entry->type = OUTPUT_LVDS;
-		break;
 	case 4:
-		switch ((conn & 0x000000f0) >> 4) {
-		case 0:
-			entry->type = OUTPUT_TMDS;
-			break;
-		case 1:
+		if (conn & 0x10)
 			entry->type = OUTPUT_LVDS;
-			break;
-		default:
-			NV_ERROR(dev, "Unknown DCB subtype 4/%d\n",
-				 (conn & 0x000000f0) >> 4);
-			return false;
-		}
+		else
+			entry->type = OUTPUT_TMDS;
+		break;
+	case 3:
+		entry->type = OUTPUT_LVDS;
 		break;
 	default:
 		NV_ERROR(dev, "Unknown DCB type %d\n", conn & 0x0000000f);
@@ -5706,13 +6194,6 @@ parse_dcb15_entry(struct drm_device *dev, struct dcb_table *dcb,
 	case OUTPUT_TV:
 		entry->tvconf.has_component_output = false;
 		break;
-	case OUTPUT_TMDS:
-		/*
-		 * Invent a DVI-A output, by copying the fields of the DVI-D
-		 * output; reported to work by math_b on an NV20(!).
-		 */
-		fabricate_vga_output(dcb, entry->i2c_index, entry->heads);
-		break;
 	case OUTPUT_LVDS:
 		if ((conn & 0x00003f00) != 0x10)
 			entry->lvdsconf.use_straps_for_mode = true;
@@ -5793,6 +6274,29 @@ void merge_like_dcb_entries(struct drm_device *dev, struct dcb_table *dcb)
 	dcb->entries = newentries;
 }
 
+static bool
+apply_dcb_encoder_quirks(struct drm_device *dev, int idx, u32 *conn, u32 *conf)
+{
+	/* Dell Precision M6300
+	 *   DCB entry 2: 02025312 00000010
+	 *   DCB entry 3: 02026312 00000020
+	 *
+	 * Identical, except apparently a different connector on a
+	 * different SOR link.  Not a clue how we're supposed to know
+	 * which one is in use if it even shares an i2c line...
+	 *
+	 * Ignore the connector on the second SOR link to prevent
+	 * nasty problems until this is sorted (assuming it's not a
+	 * VBIOS bug).
+	 */
+	if (nv_match_device(dev, 0x040d, 0x1028, 0x019b)) {
+		if (*conn == 0x02026312 && *conf == 0x00000020)
+			return false;
+	}
+
+	return true;
+}
+
 static int
 parse_dcb_table(struct drm_device *dev, struct nvbios *bios, bool twoHeads)
 {
@@ -5903,6 +6407,19 @@ parse_dcb_table(struct drm_device *dev, struct nvbios *bios, bool twoHeads)
 		dcb->i2c_table = &bios->data[i2ctabptr];
 		if (dcb->version >= 0x30)
 			dcb->i2c_default_indices = dcb->i2c_table[4];
+
+		/*
+		 * Parse the "management" I2C bus, used for hardware
+		 * monitoring and some external TMDS transmitters.
+		 */
+		if (dcb->version >= 0x22) {
+			int idx = (dcb->version >= 0x40 ?
+				   dcb->i2c_default_indices & 0xf :
+				   2);
+
+			read_dcb_i2c_entry(dev, dcb->version, dcb->i2c_table,
+					   idx, &dcb->i2c[idx]);
+		}
 	}
 
 	if (entries > DCB_MAX_NUM_ENTRIES)
@@ -5926,6 +6443,9 @@ parse_dcb_table(struct drm_device *dev, struct nvbios *bios, bool twoHeads)
 		if ((connection & 0x0000000f) == 0x0000000f)
 			continue;
 
+		if (!apply_dcb_encoder_quirks(dev, i, &connection, &config))
+			continue;
+
 		NV_TRACEWARN(dev, "Raw DCB entry %d: %08x %08x\n",
 			     dcb->entries, connection, config);
 
@@ -6181,9 +6701,8 @@ nouveau_run_vbios_init(struct drm_device *dev)
 	struct nvbios *bios = &dev_priv->vbios;
 	int i, ret = 0;
 
-	NVLockVgaCrtcs(dev, false);
-	if (nv_two_heads(dev))
-		NVSetOwner(dev, bios->state.crtchead);
+	/* Reset the BIOS head to 0. */
+	bios->state.crtchead = 0;
 
 	if (bios->major_version < 5)	/* BMP only */
 		load_nv17_hw_sequencer_ucode(dev, bios);
@@ -6216,8 +6735,6 @@ nouveau_run_vbios_init(struct drm_device *dev)
 		}
 	}
 
-	NVLockVgaCrtcs(dev, true);
-
 	return ret;
 }
 
@@ -6238,7 +6755,6 @@ static bool
 nouveau_bios_posted(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	bool was_locked;
 	unsigned htotal;
 
 	if (dev_priv->chipset >= NV_50) {
@@ -6248,13 +6764,12 @@ nouveau_bios_posted(struct drm_device *dev)
 		return true;
 	}
 
-	was_locked = NVLockVgaCrtcs(dev, false);
 	htotal  = NVReadVgaCrtc(dev, 0, 0x06);
 	htotal |= (NVReadVgaCrtc(dev, 0, 0x07) & 0x01) << 8;
 	htotal |= (NVReadVgaCrtc(dev, 0, 0x07) & 0x20) << 4;
 	htotal |= (NVReadVgaCrtc(dev, 0, 0x25) & 0x01) << 10;
 	htotal |= (NVReadVgaCrtc(dev, 0, 0x41) & 0x01) << 11;
-	NVLockVgaCrtcs(dev, was_locked);
+
 	return (htotal != 0);
 }
 
@@ -6263,8 +6778,6 @@ nouveau_bios_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nvbios *bios = &dev_priv->vbios;
-	uint32_t saved_nv_pextdev_boot_0;
-	bool was_locked;
 	int ret;
 
 	if (!NVInitVBIOS(dev))
@@ -6284,40 +6797,29 @@ nouveau_bios_init(struct drm_device *dev)
 	if (!bios->major_version)	/* we don't run version 0 bios */
 		return 0;
 
-	/* these will need remembering across a suspend */
-	saved_nv_pextdev_boot_0 = bios_rd32(bios, NV_PEXTDEV_BOOT_0);
-	bios->state.saved_nv_pfb_cfg0 = bios_rd32(bios, NV_PFB_CFG0);
-
 	/* init script execution disabled */
 	bios->execute = false;
 
 	/* ... unless card isn't POSTed already */
 	if (!nouveau_bios_posted(dev)) {
-		NV_INFO(dev, "Adaptor not initialised\n");
-		if (dev_priv->card_type < NV_40) {
-			NV_ERROR(dev, "Unable to POST this chipset\n");
-			return -ENODEV;
-		}
-
-		NV_INFO(dev, "Running VBIOS init tables\n");
+		NV_INFO(dev, "Adaptor not initialised, "
+			"running VBIOS init tables.\n");
 		bios->execute = true;
 	}
-
-	bios_wr32(bios, NV_PEXTDEV_BOOT_0, saved_nv_pextdev_boot_0);
+	if (nouveau_force_post)
+		bios->execute = true;
 
 	ret = nouveau_run_vbios_init(dev);
 	if (ret)
 		return ret;
 
 	/* feature_byte on BMP is poor, but init always sets CR4B */
-	was_locked = NVLockVgaCrtcs(dev, false);
 	if (bios->major_version < 5)
 		bios->is_mobile = NVReadVgaCrtc(dev, 0, NV_CIO_CRE_4B) & 0x40;
 
 	/* all BIT systems need p_f_m_t for digital_min_front_porch */
 	if (bios->is_mobile || bios->major_version >= 5)
 		ret = parse_fp_mode_table(dev, bios);
-	NVLockVgaCrtcs(dev, was_locked);
 
 	/* allow subsequent scripts to execute */
 	bios->execute = true;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.h b/drivers/gpu/drm/nouveau/nouveau_bios.h
index adf4ec2..c1de2f3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.h
@@ -81,6 +81,7 @@ struct dcb_connector_table_entry {
 	enum dcb_connector_type type;
 	uint8_t index2;
 	uint8_t gpio_tag;
+	void *drm;
 };
 
 struct dcb_connector_table {
@@ -94,6 +95,7 @@ enum dcb_type {
 	OUTPUT_TMDS = 2,
 	OUTPUT_LVDS = 3,
 	OUTPUT_DP = 6,
+	OUTPUT_EOL = 14, /* DCB 4.0+, appears to be end-of-list */
 	OUTPUT_ANY = -1
 };
 
@@ -117,6 +119,7 @@ struct dcb_entry {
 		struct {
 			struct sor_conf sor;
 			bool use_straps_for_mode;
+			bool use_acpi_for_edid;
 			bool use_power_scripts;
 		} lvdsconf;
 		struct {
@@ -129,6 +132,7 @@ struct dcb_entry {
 		} dpconf;
 		struct {
 			struct sor_conf sor;
+			int slave_addr;
 		} tmdsconf;
 	};
 	bool i2c_upper_default;
@@ -249,8 +253,6 @@ struct nvbios {
 
 	struct {
 		int crtchead;
-		/* these need remembering across suspend */
-		uint32_t saved_nv_pfb_cfg0;
 	} state;
 
 	struct {
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 51746d9..a4011f5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -58,17 +58,12 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
 	struct drm_device *dev = dev_priv->dev;
 	struct nouveau_bo *nvbo = nouveau_bo(bo);
 
-	ttm_bo_kunmap(&nvbo->kmap);
-
 	if (unlikely(nvbo->gem))
 		DRM_ERROR("bo %p still attached to GEM object\n", bo);
 
 	if (nvbo->tile)
 		nv10_mem_expire_tiling(dev, nvbo->tile, NULL);
 
-	spin_lock(&dev_priv->ttm.bo_list_lock);
-	list_del(&nvbo->head);
-	spin_unlock(&dev_priv->ttm.bo_list_lock);
 	kfree(nvbo);
 }
 
@@ -167,8 +162,6 @@ nouveau_bo_new(struct drm_device *dev, struct nouveau_channel *chan,
 	nouveau_bo_fixup_align(dev, tile_mode, tile_flags, &align, &size);
 	align >>= PAGE_SHIFT;
 
-	nvbo->placement.fpfn = 0;
-	nvbo->placement.lpfn = mappable ? dev_priv->fb_mappable_pages : 0;
 	nouveau_bo_placement_set(nvbo, flags, 0);
 
 	nvbo->channel = chan;
@@ -181,9 +174,6 @@ nouveau_bo_new(struct drm_device *dev, struct nouveau_channel *chan,
 	}
 	nvbo->channel = NULL;
 
-	spin_lock(&dev_priv->ttm.bo_list_lock);
-	list_add_tail(&nvbo->head, &dev_priv->ttm.bo_list);
-	spin_unlock(&dev_priv->ttm.bo_list_lock);
 	*pnvbo = nvbo;
 	return 0;
 }
@@ -311,7 +301,8 @@ nouveau_bo_map(struct nouveau_bo *nvbo)
 void
 nouveau_bo_unmap(struct nouveau_bo *nvbo)
 {
-	ttm_bo_kunmap(&nvbo->kmap);
+	if (nvbo)
+		ttm_bo_kunmap(&nvbo->kmap);
 }
 
 u16
@@ -410,7 +401,10 @@ nouveau_bo_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 		man->available_caching = TTM_PL_FLAG_UNCACHED |
 					 TTM_PL_FLAG_WC;
 		man->default_caching = TTM_PL_FLAG_WC;
-		man->gpu_offset = dev_priv->vm_vram_base;
+		if (dev_priv->card_type == NV_50)
+			man->gpu_offset = 0x40000000;
+		else
+			man->gpu_offset = 0;
 		break;
 	case TTM_PL_TT:
 		switch (dev_priv->gart_info.type) {
@@ -476,18 +470,20 @@ nouveau_bo_move_accel_cleanup(struct nouveau_channel *chan,
 		return ret;
 
 	ret = ttm_bo_move_accel_cleanup(&nvbo->bo, fence, NULL,
-					evict, no_wait_reserve, no_wait_gpu, new_mem);
-	if (nvbo->channel && nvbo->channel != chan)
-		ret = nouveau_fence_wait(fence, NULL, false, false);
+					evict || (nvbo->channel &&
+						  nvbo->channel != chan),
+					no_wait_reserve, no_wait_gpu, new_mem);
 	nouveau_fence_unref((void *)&fence);
 	return ret;
 }
 
 static inline uint32_t
-nouveau_bo_mem_ctxdma(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
-		      struct ttm_mem_reg *mem)
+nouveau_bo_mem_ctxdma(struct ttm_buffer_object *bo,
+		      struct nouveau_channel *chan, struct ttm_mem_reg *mem)
 {
-	if (chan == nouveau_bdev(nvbo->bo.bdev)->channel) {
+	struct nouveau_bo *nvbo = nouveau_bo(bo);
+
+	if (nvbo->no_vm) {
 		if (mem->mem_type == TTM_PL_TT)
 			return NvDmaGART;
 		return NvDmaVRAM;
@@ -499,86 +495,181 @@ nouveau_bo_mem_ctxdma(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
 }
 
 static int
-nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict, bool intr,
-		     bool no_wait_reserve, bool no_wait_gpu,
-		     struct ttm_mem_reg *new_mem)
+nv50_bo_move_m2mf(struct nouveau_channel *chan, struct ttm_buffer_object *bo,
+		  struct ttm_mem_reg *old_mem, struct ttm_mem_reg *new_mem)
 {
-	struct nouveau_bo *nvbo = nouveau_bo(bo);
 	struct drm_nouveau_private *dev_priv = nouveau_bdev(bo->bdev);
-	struct ttm_mem_reg *old_mem = &bo->mem;
-	struct nouveau_channel *chan;
-	uint64_t src_offset, dst_offset;
-	uint32_t page_count;
+	struct nouveau_bo *nvbo = nouveau_bo(bo);
+	u64 length = (new_mem->num_pages << PAGE_SHIFT);
+	u64 src_offset, dst_offset;
 	int ret;
 
-	chan = nvbo->channel;
-	if (!chan || nvbo->tile_flags || nvbo->no_vm)
-		chan = dev_priv->channel;
-
 	src_offset = old_mem->mm_node->start << PAGE_SHIFT;
 	dst_offset = new_mem->mm_node->start << PAGE_SHIFT;
-	if (chan != dev_priv->channel) {
-		if (old_mem->mem_type == TTM_PL_TT)
-			src_offset += dev_priv->vm_gart_base;
-		else
+	if (!nvbo->no_vm) {
+		if (old_mem->mem_type == TTM_PL_VRAM)
 			src_offset += dev_priv->vm_vram_base;
-
-		if (new_mem->mem_type == TTM_PL_TT)
-			dst_offset += dev_priv->vm_gart_base;
 		else
+			src_offset += dev_priv->vm_gart_base;
+
+		if (new_mem->mem_type == TTM_PL_VRAM)
 			dst_offset += dev_priv->vm_vram_base;
+		else
+			dst_offset += dev_priv->vm_gart_base;
 	}
 
 	ret = RING_SPACE(chan, 3);
 	if (ret)
 		return ret;
-	BEGIN_RING(chan, NvSubM2MF, NV_MEMORY_TO_MEMORY_FORMAT_DMA_SOURCE, 2);
-	OUT_RING(chan, nouveau_bo_mem_ctxdma(nvbo, chan, old_mem));
-	OUT_RING(chan, nouveau_bo_mem_ctxdma(nvbo, chan, new_mem));
 
-	if (dev_priv->card_type >= NV_50) {
-		ret = RING_SPACE(chan, 4);
+	BEGIN_RING(chan, NvSubM2MF, 0x0184, 2);
+	OUT_RING  (chan, nouveau_bo_mem_ctxdma(bo, chan, old_mem));
+	OUT_RING  (chan, nouveau_bo_mem_ctxdma(bo, chan, new_mem));
+
+	while (length) {
+		u32 amount, stride, height;
+
+		amount  = min(length, (u64)(4 * 1024 * 1024));
+		stride  = 16 * 4;
+		height  = amount / stride;
+
+		if (new_mem->mem_type == TTM_PL_VRAM && nvbo->tile_flags) {
+			ret = RING_SPACE(chan, 8);
+			if (ret)
+				return ret;
+
+			BEGIN_RING(chan, NvSubM2MF, 0x0200, 7);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, stride);
+			OUT_RING  (chan, height);
+			OUT_RING  (chan, 1);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, 0);
+		} else {
+			ret = RING_SPACE(chan, 2);
+			if (ret)
+				return ret;
+
+			BEGIN_RING(chan, NvSubM2MF, 0x0200, 1);
+			OUT_RING  (chan, 1);
+		}
+		if (old_mem->mem_type == TTM_PL_VRAM && nvbo->tile_flags) {
+			ret = RING_SPACE(chan, 8);
+			if (ret)
+				return ret;
+
+			BEGIN_RING(chan, NvSubM2MF, 0x021c, 7);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, stride);
+			OUT_RING  (chan, height);
+			OUT_RING  (chan, 1);
+			OUT_RING  (chan, 0);
+			OUT_RING  (chan, 0);
+		} else {
+			ret = RING_SPACE(chan, 2);
+			if (ret)
+				return ret;
+
+			BEGIN_RING(chan, NvSubM2MF, 0x021c, 1);
+			OUT_RING  (chan, 1);
+		}
+
+		ret = RING_SPACE(chan, 14);
 		if (ret)
 			return ret;
-		BEGIN_RING(chan, NvSubM2MF, 0x0200, 1);
-		OUT_RING(chan, 1);
-		BEGIN_RING(chan, NvSubM2MF, 0x021c, 1);
-		OUT_RING(chan, 1);
+
+		BEGIN_RING(chan, NvSubM2MF, 0x0238, 2);
+		OUT_RING  (chan, upper_32_bits(src_offset));
+		OUT_RING  (chan, upper_32_bits(dst_offset));
+		BEGIN_RING(chan, NvSubM2MF, 0x030c, 8);
+		OUT_RING  (chan, lower_32_bits(src_offset));
+		OUT_RING  (chan, lower_32_bits(dst_offset));
+		OUT_RING  (chan, stride);
+		OUT_RING  (chan, stride);
+		OUT_RING  (chan, stride);
+		OUT_RING  (chan, height);
+		OUT_RING  (chan, 0x00000101);
+		OUT_RING  (chan, 0x00000000);
+		BEGIN_RING(chan, NvSubM2MF, NV_MEMORY_TO_MEMORY_FORMAT_NOP, 1);
+		OUT_RING  (chan, 0);
+
+		length -= amount;
+		src_offset += amount;
+		dst_offset += amount;
 	}
 
+	return 0;
+}
+
+static int
+nv04_bo_move_m2mf(struct nouveau_channel *chan, struct ttm_buffer_object *bo,
+		  struct ttm_mem_reg *old_mem, struct ttm_mem_reg *new_mem)
+{
+	u32 src_offset = old_mem->mm_node->start << PAGE_SHIFT;
+	u32 dst_offset = new_mem->mm_node->start << PAGE_SHIFT;
+	u32 page_count = new_mem->num_pages;
+	int ret;
+
+	ret = RING_SPACE(chan, 3);
+	if (ret)
+		return ret;
+
+	BEGIN_RING(chan, NvSubM2MF, NV_MEMORY_TO_MEMORY_FORMAT_DMA_SOURCE, 2);
+	OUT_RING  (chan, nouveau_bo_mem_ctxdma(bo, chan, old_mem));
+	OUT_RING  (chan, nouveau_bo_mem_ctxdma(bo, chan, new_mem));
+
 	page_count = new_mem->num_pages;
 	while (page_count) {
 		int line_count = (page_count > 2047) ? 2047 : page_count;
 
-		if (dev_priv->card_type >= NV_50) {
-			ret = RING_SPACE(chan, 3);
-			if (ret)
-				return ret;
-			BEGIN_RING(chan, NvSubM2MF, 0x0238, 2);
-			OUT_RING(chan, upper_32_bits(src_offset));
-			OUT_RING(chan, upper_32_bits(dst_offset));
-		}
 		ret = RING_SPACE(chan, 11);
 		if (ret)
 			return ret;
+
 		BEGIN_RING(chan, NvSubM2MF,
 				 NV_MEMORY_TO_MEMORY_FORMAT_OFFSET_IN, 8);
-		OUT_RING(chan, lower_32_bits(src_offset));
-		OUT_RING(chan, lower_32_bits(dst_offset));
-		OUT_RING(chan, PAGE_SIZE); /* src_pitch */
-		OUT_RING(chan, PAGE_SIZE); /* dst_pitch */
-		OUT_RING(chan, PAGE_SIZE); /* line_length */
-		OUT_RING(chan, line_count);
-		OUT_RING(chan, (1<<8)|(1<<0));
-		OUT_RING(chan, 0);
+		OUT_RING  (chan, src_offset);
+		OUT_RING  (chan, dst_offset);
+		OUT_RING  (chan, PAGE_SIZE); /* src_pitch */
+		OUT_RING  (chan, PAGE_SIZE); /* dst_pitch */
+		OUT_RING  (chan, PAGE_SIZE); /* line_length */
+		OUT_RING  (chan, line_count);
+		OUT_RING  (chan, 0x00000101);
+		OUT_RING  (chan, 0x00000000);
 		BEGIN_RING(chan, NvSubM2MF, NV_MEMORY_TO_MEMORY_FORMAT_NOP, 1);
-		OUT_RING(chan, 0);
+		OUT_RING  (chan, 0);
 
 		page_count -= line_count;
 		src_offset += (PAGE_SIZE * line_count);
 		dst_offset += (PAGE_SIZE * line_count);
 	}
 
+	return 0;
+}
+
+static int
+nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict, bool intr,
+		     bool no_wait_reserve, bool no_wait_gpu,
+		     struct ttm_mem_reg *new_mem)
+{
+	struct drm_nouveau_private *dev_priv = nouveau_bdev(bo->bdev);
+	struct nouveau_bo *nvbo = nouveau_bo(bo);
+	struct nouveau_channel *chan;
+	int ret;
+
+	chan = nvbo->channel;
+	if (!chan || nvbo->no_vm)
+		chan = dev_priv->channel;
+
+	if (dev_priv->card_type < NV_50)
+		ret = nv04_bo_move_m2mf(chan, bo, &bo->mem, new_mem);
+	else
+		ret = nv50_bo_move_m2mf(chan, bo, &bo->mem, new_mem);
+	if (ret)
+		return ret;
+
 	return nouveau_bo_move_accel_cleanup(chan, nvbo, evict, no_wait_reserve, no_wait_gpu, new_mem);
 }
 
@@ -725,13 +816,6 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool evict, bool intr,
 	if (ret)
 		return ret;
 
-	/* Software copy if the card isn't up and running yet. */
-	if (dev_priv->init_state != NOUVEAU_CARD_INIT_DONE ||
-	    !dev_priv->channel) {
-		ret = ttm_bo_move_memcpy(bo, evict, no_wait_reserve, no_wait_gpu, new_mem);
-		goto out;
-	}
-
 	/* Fake bo copy. */
 	if (old_mem->mem_type == TTM_PL_SYSTEM && !bo->ttm) {
 		BUG_ON(bo->mem.mm_node != NULL);
@@ -740,6 +824,12 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool evict, bool intr,
 		goto out;
 	}
 
+	/* Software copy if the card isn't up and running yet. */
+	if (!dev_priv->channel) {
+		ret = ttm_bo_move_memcpy(bo, evict, no_wait_reserve, no_wait_gpu, new_mem);
+		goto out;
+	}
+
 	/* Hardware assisted copy. */
 	if (new_mem->mem_type == TTM_PL_SYSTEM)
 		ret = nouveau_bo_move_flipd(bo, evict, intr, no_wait_reserve, no_wait_gpu, new_mem);
@@ -815,7 +905,26 @@ nouveau_ttm_io_mem_free(struct ttm_bo_device *bdev, struct ttm_mem_reg *mem)
 static int
 nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
 {
-	return 0;
+	struct drm_nouveau_private *dev_priv = nouveau_bdev(bo->bdev);
+	struct nouveau_bo *nvbo = nouveau_bo(bo);
+
+	/* as long as the bo isn't in vram, and isn't tiled, we've got
+	 * nothing to do here.
+	 */
+	if (bo->mem.mem_type != TTM_PL_VRAM) {
+		if (dev_priv->card_type < NV_50 || !nvbo->tile_flags)
+			return 0;
+	}
+
+	/* make sure bo is in mappable vram */
+	if (bo->mem.mm_node->start + bo->mem.num_pages < dev_priv->fb_mappable_pages)
+		return 0;
+
+
+	nvbo->placement.fpfn = 0;
+	nvbo->placement.lpfn = dev_priv->fb_mappable_pages;
+	nouveau_bo_placement_set(nvbo, TTM_PL_VRAM, 0);
+	return ttm_bo_validate(bo, &nvbo->placement, false, true, false);
 }
 
 struct ttm_bo_driver nouveau_bo_driver = {
diff --git a/drivers/gpu/drm/nouveau/nouveau_calc.c b/drivers/gpu/drm/nouveau/nouveau_calc.c
index 88f9bc0..23d9896 100644
--- a/drivers/gpu/drm/nouveau/nouveau_calc.c
+++ b/drivers/gpu/drm/nouveau/nouveau_calc.c
@@ -200,7 +200,7 @@ nv04_update_arb(struct drm_device *dev, int VClk, int bpp,
 	struct nv_sim_state sim_data;
 	int MClk = nouveau_hw_get_clock(dev, MPLL);
 	int NVClk = nouveau_hw_get_clock(dev, NVPLL);
-	uint32_t cfg1 = nvReadFB(dev, NV_PFB_CFG1);
+	uint32_t cfg1 = nvReadFB(dev, NV04_PFB_CFG1);
 
 	sim_data.pclk_khz = VClk;
 	sim_data.mclk_khz = MClk;
@@ -218,7 +218,7 @@ nv04_update_arb(struct drm_device *dev, int VClk, int bpp,
 		sim_data.mem_latency = 3;
 		sim_data.mem_page_miss = 10;
 	} else {
-		sim_data.memory_type = nvReadFB(dev, NV_PFB_CFG0) & 0x1;
+		sim_data.memory_type = nvReadFB(dev, NV04_PFB_CFG0) & 0x1;
 		sim_data.memory_width = (nvReadEXTDEV(dev, NV_PEXTDEV_BOOT_0) & 0x10) ? 128 : 64;
 		sim_data.mem_latency = cfg1 & 0xf;
 		sim_data.mem_page_miss = ((cfg1 >> 4) & 0xf) + ((cfg1 >> 31) & 0x1);
@@ -234,7 +234,7 @@ nv04_update_arb(struct drm_device *dev, int VClk, int bpp,
 }
 
 static void
-nv30_update_arb(int *burst, int *lwm)
+nv20_update_arb(int *burst, int *lwm)
 {
 	unsigned int fifo_size, burst_size, graphics_lwm;
 
@@ -251,14 +251,14 @@ nouveau_calc_arb(struct drm_device *dev, int vclk, int bpp, int *burst, int *lwm
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
-	if (dev_priv->card_type < NV_30)
+	if (dev_priv->card_type < NV_20)
 		nv04_update_arb(dev, vclk, bpp, burst, lwm);
 	else if ((dev->pci_device & 0xfff0) == 0x0240 /*CHIPSET_C51*/ ||
 		 (dev->pci_device & 0xfff0) == 0x03d0 /*CHIPSET_C512*/) {
 		*burst = 128;
 		*lwm = 0x0480;
 	} else
-		nv30_update_arb(burst, lwm);
+		nv20_update_arb(burst, lwm);
 }
 
 static int
diff --git a/drivers/gpu/drm/nouveau/nouveau_channel.c b/drivers/gpu/drm/nouveau/nouveau_channel.c
index 1fc57ef..53c2a6f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_channel.c
+++ b/drivers/gpu/drm/nouveau/nouveau_channel.c
@@ -69,14 +69,8 @@ nouveau_channel_pushbuf_ctxdma_init(struct nouveau_channel *chan)
 		chan->pushbuf_base = pb->bo.mem.mm_node->start << PAGE_SHIFT;
 	}
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, 0, pushbuf, &chan->pushbuf);
-	if (ret) {
-		NV_ERROR(dev, "Error referencing pushbuf ctxdma: %d\n", ret);
-		if (pushbuf != dev_priv->gart_info.sg_ctxdma)
-			nouveau_gpuobj_del(dev, &pushbuf);
-		return ret;
-	}
-
+	nouveau_gpuobj_ref(pushbuf, &chan->pushbuf);
+	nouveau_gpuobj_ref(NULL, &pushbuf);
 	return 0;
 }
 
@@ -257,9 +251,7 @@ nouveau_channel_free(struct nouveau_channel *chan)
 	nouveau_debugfs_channel_fini(chan);
 
 	/* Give outstanding push buffers a chance to complete */
-	spin_lock_irqsave(&chan->fence.lock, flags);
 	nouveau_fence_update(chan);
-	spin_unlock_irqrestore(&chan->fence.lock, flags);
 	if (chan->fence.sequence != chan->fence.sequence_ack) {
 		struct nouveau_fence *fence = NULL;
 
@@ -309,8 +301,9 @@ nouveau_channel_free(struct nouveau_channel *chan)
 	spin_unlock_irqrestore(&dev_priv->context_switch_lock, flags);
 
 	/* Release the channel's resources */
-	nouveau_gpuobj_ref_del(dev, &chan->pushbuf);
+	nouveau_gpuobj_ref(NULL, &chan->pushbuf);
 	if (chan->pushbuf_bo) {
+		nouveau_bo_unmap(chan->pushbuf_bo);
 		nouveau_bo_unpin(chan->pushbuf_bo);
 		nouveau_bo_ref(NULL, &chan->pushbuf_bo);
 	}
@@ -368,8 +361,6 @@ nouveau_ioctl_fifo_alloc(struct drm_device *dev, void *data,
 	struct nouveau_channel *chan;
 	int ret;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	if (dev_priv->engine.graph.accel_blocked)
 		return -ENODEV;
 
@@ -418,7 +409,6 @@ nouveau_ioctl_fifo_free(struct drm_device *dev, void *data,
 	struct drm_nouveau_channel_free *cfree = data;
 	struct nouveau_channel *chan;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
 	NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(cfree->channel, file_priv, chan);
 
 	nouveau_channel_free(chan);
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 149ed22..46584c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -37,12 +37,6 @@
 #include "nouveau_connector.h"
 #include "nouveau_hw.h"
 
-static inline struct drm_encoder_slave_funcs *
-get_slave_funcs(struct nouveau_encoder *enc)
-{
-	return to_encoder_slave(to_drm_encoder(enc))->slave_funcs;
-}
-
 static struct nouveau_encoder *
 find_encoder_by_type(struct drm_connector *connector, int type)
 {
@@ -82,6 +76,22 @@ nouveau_encoder_connector_get(struct nouveau_encoder *encoder)
 	return NULL;
 }
 
+/*TODO: This could use improvement, and learn to handle the fixed
+ *      BIOS tables etc.  It's fine currently, for its only user.
+ */
+int
+nouveau_connector_bpp(struct drm_connector *connector)
+{
+	struct nouveau_connector *nv_connector = nouveau_connector(connector);
+
+	if (nv_connector->edid && nv_connector->edid->revision >= 4) {
+		u8 bpc = ((nv_connector->edid->input & 0x70) >> 3) + 4;
+		if (bpc > 4)
+			return bpc;
+	}
+
+	return 18;
+}
 
 static void
 nouveau_connector_destroy(struct drm_connector *drm_connector)
@@ -102,60 +112,12 @@ nouveau_connector_destroy(struct drm_connector *drm_connector)
 	kfree(drm_connector);
 }
 
-static void
-nouveau_connector_ddc_prepare(struct drm_connector *connector, int *flags)
-{
-	struct drm_nouveau_private *dev_priv = connector->dev->dev_private;
-
-	if (dev_priv->card_type >= NV_50)
-		return;
-
-	*flags = 0;
-	if (NVLockVgaCrtcs(dev_priv->dev, false))
-		*flags |= 1;
-	if (nv_heads_tied(dev_priv->dev))
-		*flags |= 2;
-
-	if (*flags & 2)
-		NVSetOwner(dev_priv->dev, 0); /* necessary? */
-}
-
-static void
-nouveau_connector_ddc_finish(struct drm_connector *connector, int flags)
-{
-	struct drm_nouveau_private *dev_priv = connector->dev->dev_private;
-
-	if (dev_priv->card_type >= NV_50)
-		return;
-
-	if (flags & 2)
-		NVSetOwner(dev_priv->dev, 4);
-	if (flags & 1)
-		NVLockVgaCrtcs(dev_priv->dev, true);
-}
-
 static struct nouveau_i2c_chan *
 nouveau_connector_ddc_detect(struct drm_connector *connector,
 			     struct nouveau_encoder **pnv_encoder)
 {
 	struct drm_device *dev = connector->dev;
-	uint8_t out_buf[] = { 0x0, 0x0}, buf[2];
-	int ret, flags, i;
-
-	struct i2c_msg msgs[] = {
-		{
-			.addr = 0x50,
-			.flags = 0,
-			.len = 1,
-			.buf = out_buf,
-		},
-		{
-			.addr = 0x50,
-			.flags = I2C_M_RD,
-			.len = 1,
-			.buf = buf,
-		}
-	};
+	int i;
 
 	for (i = 0; i < DRM_CONNECTOR_MAX_ENCODER; i++) {
 		struct nouveau_i2c_chan *i2c = NULL;
@@ -174,14 +136,8 @@ nouveau_connector_ddc_detect(struct drm_connector *connector,
 
 		if (nv_encoder->dcb->i2c_index < 0xf)
 			i2c = nouveau_i2c_find(dev, nv_encoder->dcb->i2c_index);
-		if (!i2c)
-			continue;
-
-		nouveau_connector_ddc_prepare(connector, &flags);
-		ret = i2c_transfer(&i2c->adapter, msgs, 2);
-		nouveau_connector_ddc_finish(connector, flags);
 
-		if (ret == 2) {
+		if (i2c && nouveau_probe_i2c_addr(i2c, 0x50)) {
 			*pnv_encoder = nv_encoder;
 			return i2c;
 		}
@@ -190,6 +146,36 @@ nouveau_connector_ddc_detect(struct drm_connector *connector,
 	return NULL;
 }
 
+static struct nouveau_encoder *
+nouveau_connector_of_detect(struct drm_connector *connector)
+{
+#ifdef __powerpc__
+	struct drm_device *dev = connector->dev;
+	struct nouveau_connector *nv_connector = nouveau_connector(connector);
+	struct nouveau_encoder *nv_encoder;
+	struct device_node *cn, *dn = pci_device_to_OF_node(dev->pdev);
+
+	if (!dn ||
+	    !((nv_encoder = find_encoder_by_type(connector, OUTPUT_TMDS)) ||
+	      (nv_encoder = find_encoder_by_type(connector, OUTPUT_ANALOG))))
+		return NULL;
+
+	for_each_child_of_node(dn, cn) {
+		const char *name = of_get_property(cn, "name", NULL);
+		const void *edid = of_get_property(cn, "EDID", NULL);
+		int idx = name ? name[strlen(name) - 1] - 'A' : 0;
+
+		if (nv_encoder->dcb->i2c_index == idx && edid) {
+			nv_connector->edid =
+				kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
+			of_node_put(cn);
+			return nv_encoder;
+		}
+	}
+#endif
+	return NULL;
+}
+
 static void
 nouveau_connector_set_encoder(struct drm_connector *connector,
 			      struct nouveau_encoder *nv_encoder)
@@ -234,21 +220,7 @@ nouveau_connector_detect(struct drm_connector *connector)
 	struct nouveau_connector *nv_connector = nouveau_connector(connector);
 	struct nouveau_encoder *nv_encoder = NULL;
 	struct nouveau_i2c_chan *i2c;
-	int type, flags;
-
-	if (nv_connector->dcb->type == DCB_CONNECTOR_LVDS)
-		nv_encoder = find_encoder_by_type(connector, OUTPUT_LVDS);
-	if (nv_encoder && nv_connector->native_mode) {
-		unsigned status = connector_status_connected;
-
-#if defined(CONFIG_ACPI_BUTTON) || \
-	(defined(CONFIG_ACPI_BUTTON_MODULE) && defined(MODULE))
-		if (!nouveau_ignorelid && !acpi_lid_open())
-			status = connector_status_unknown;
-#endif
-		nouveau_connector_set_encoder(connector, nv_encoder);
-		return status;
-	}
+	int type;
 
 	/* Cleanup the previous EDID block. */
 	if (nv_connector->edid) {
@@ -259,9 +231,7 @@ nouveau_connector_detect(struct drm_connector *connector)
 
 	i2c = nouveau_connector_ddc_detect(connector, &nv_encoder);
 	if (i2c) {
-		nouveau_connector_ddc_prepare(connector, &flags);
 		nv_connector->edid = drm_get_edid(connector, &i2c->adapter);
-		nouveau_connector_ddc_finish(connector, flags);
 		drm_mode_connector_update_edid_property(connector,
 							nv_connector->edid);
 		if (!nv_connector->edid) {
@@ -301,6 +271,12 @@ nouveau_connector_detect(struct drm_connector *connector)
 		return connector_status_connected;
 	}
 
+	nv_encoder = nouveau_connector_of_detect(connector);
+	if (nv_encoder) {
+		nouveau_connector_set_encoder(connector, nv_encoder);
+		return connector_status_connected;
+	}
+
 detect_analog:
 	nv_encoder = find_encoder_by_type(connector, OUTPUT_ANALOG);
 	if (!nv_encoder && !nouveau_tv_disable)
@@ -321,6 +297,85 @@ detect_analog:
 	return connector_status_disconnected;
 }
 
+static enum drm_connector_status
+nouveau_connector_detect_lvds(struct drm_connector *connector)
+{
+	struct drm_device *dev = connector->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_connector *nv_connector = nouveau_connector(connector);
+	struct nouveau_encoder *nv_encoder = NULL;
+	enum drm_connector_status status = connector_status_disconnected;
+
+	/* Cleanup the previous EDID block. */
+	if (nv_connector->edid) {
+		drm_mode_connector_update_edid_property(connector, NULL);
+		kfree(nv_connector->edid);
+		nv_connector->edid = NULL;
+	}
+
+	nv_encoder = find_encoder_by_type(connector, OUTPUT_LVDS);
+	if (!nv_encoder)
+		return connector_status_disconnected;
+
+	/* Try retrieving EDID via DDC */
+	if (!dev_priv->vbios.fp_no_ddc) {
+		status = nouveau_connector_detect(connector);
+		if (status == connector_status_connected)
+			goto out;
+	}
+
+	/* On some laptops (Sony, I'm looking at you) there appears to
+	 * be no direct way of accessing the panel's EDID.  The only
+	 * option available to us appears to be to ask ACPI for help.
+	 *
+	 * It's important that this check comes before trying straps:
+	 * one of said manufacturer's laptops is configured in such a
+	 * way that nouveau decides an entry in the VBIOS FP mode table
+	 * is valid, when it's not (rh#613284).
+	 */
+	if (nv_encoder->dcb->lvdsconf.use_acpi_for_edid) {
+		if (!nouveau_acpi_edid(dev, connector)) {
+			status = connector_status_connected;
+			goto out;
+		}
+	}
+
+	/* If no EDID found above, and the VBIOS indicates a hardcoded
+	 * modeline is available for the panel, set it as the panel's
+	 * native mode and exit.
+	 */
+	if (nouveau_bios_fp_mode(dev, NULL) && (dev_priv->vbios.fp_no_ddc ||
+	    nv_encoder->dcb->lvdsconf.use_straps_for_mode)) {
+		status = connector_status_connected;
+		goto out;
+	}
+
+	/* Still nothing; some VBIOS images have a hardcoded EDID block
+	 * for the panel stored in them.
+	 */
+	if (!dev_priv->vbios.fp_no_ddc) {
+		struct edid *edid =
+			(struct edid *)nouveau_bios_embedded_edid(dev);
+		if (edid)
+			nv_connector->edid =
+				kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
+		if (edid && nv_connector->edid)
+			status = connector_status_connected;
+	}
+
+out:
+#if defined(CONFIG_ACPI_BUTTON) || \
+	(defined(CONFIG_ACPI_BUTTON_MODULE) && defined(MODULE))
+	if (status == connector_status_connected &&
+	    !nouveau_ignorelid && !acpi_lid_open())
+		status = connector_status_unknown;
+#endif
+
+	drm_mode_connector_update_edid_property(connector, nv_connector->edid);
+	nouveau_connector_set_encoder(connector, nv_encoder);
+	return status;
+}
+
 static void
 nouveau_connector_force(struct drm_connector *connector)
 {
@@ -353,6 +408,7 @@ nouveau_connector_set_property(struct drm_connector *connector,
 {
 	struct nouveau_connector *nv_connector = nouveau_connector(connector);
 	struct nouveau_encoder *nv_encoder = nv_connector->detected_encoder;
+	struct drm_encoder *encoder = to_drm_encoder(nv_encoder);
 	struct drm_device *dev = connector->dev;
 	int ret;
 
@@ -425,8 +481,8 @@ nouveau_connector_set_property(struct drm_connector *connector,
 	}
 
 	if (nv_encoder && nv_encoder->dcb->type == OUTPUT_TV)
-		return get_slave_funcs(nv_encoder)->
-			set_property(to_drm_encoder(nv_encoder), connector, property, value);
+		return get_slave_funcs(encoder)->set_property(
+			encoder, connector, property, value);
 
 	return -EINVAL;
 }
@@ -441,7 +497,8 @@ nouveau_connector_native_mode(struct drm_connector *connector)
 	int high_w = 0, high_h = 0, high_v = 0;
 
 	list_for_each_entry(mode, &nv_connector->base.probed_modes, head) {
-		if (helper->mode_valid(connector, mode) != MODE_OK)
+		if (helper->mode_valid(connector, mode) != MODE_OK ||
+		    (mode->flags & DRM_MODE_FLAG_INTERLACE))
 			continue;
 
 		/* Use preferred mode if there is one.. */
@@ -534,21 +591,30 @@ static int
 nouveau_connector_get_modes(struct drm_connector *connector)
 {
 	struct drm_device *dev = connector->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_connector *nv_connector = nouveau_connector(connector);
 	struct nouveau_encoder *nv_encoder = nv_connector->detected_encoder;
+	struct drm_encoder *encoder = to_drm_encoder(nv_encoder);
 	int ret = 0;
 
-	/* If we're not LVDS, destroy the previous native mode, the attached
-	 * monitor could have changed.
+	/* Destroy the native mode; the attached monitor could have changed.
 	 */
-	if (nv_connector->dcb->type != DCB_CONNECTOR_LVDS &&
-	    nv_connector->native_mode) {
+	if (nv_connector->native_mode) {
 		drm_mode_destroy(dev, nv_connector->native_mode);
 		nv_connector->native_mode = NULL;
 	}
 
 	if (nv_connector->edid)
 		ret = drm_add_edid_modes(connector, nv_connector->edid);
+	else
+	if (nv_encoder->dcb->type == OUTPUT_LVDS &&
+	    (nv_encoder->dcb->lvdsconf.use_straps_for_mode ||
+	     dev_priv->vbios.fp_no_ddc) && nouveau_bios_fp_mode(dev, NULL)) {
+		struct drm_display_mode mode;
+
+		nouveau_bios_fp_mode(dev, &mode);
+		nv_connector->native_mode = drm_mode_duplicate(dev, &mode);
+	}
 
 	/* Find the native mode if this is a digital panel, if we didn't
 	 * find any modes through DDC previously add the native mode to
@@ -566,10 +632,10 @@ nouveau_connector_get_modes(struct drm_connector *connector)
 	}
 
 	if (nv_encoder->dcb->type == OUTPUT_TV)
-		ret = get_slave_funcs(nv_encoder)->
-			get_modes(to_drm_encoder(nv_encoder), connector);
+		ret = get_slave_funcs(encoder)->get_modes(encoder, connector);
 
-	if (nv_encoder->dcb->type == OUTPUT_LVDS)
+	if (nv_connector->dcb->type == DCB_CONNECTOR_LVDS ||
+	    nv_connector->dcb->type == DCB_CONNECTOR_eDP)
 		ret += nouveau_connector_scaler_modes_add(connector);
 
 	return ret;
@@ -582,6 +648,7 @@ nouveau_connector_mode_valid(struct drm_connector *connector,
 	struct drm_nouveau_private *dev_priv = connector->dev->dev_private;
 	struct nouveau_connector *nv_connector = nouveau_connector(connector);
 	struct nouveau_encoder *nv_encoder = nv_connector->detected_encoder;
+	struct drm_encoder *encoder = to_drm_encoder(nv_encoder);
 	unsigned min_clock = 25000, max_clock = min_clock;
 	unsigned clock = mode->clock;
 
@@ -608,15 +675,14 @@ nouveau_connector_mode_valid(struct drm_connector *connector,
 			max_clock = 350000;
 		break;
 	case OUTPUT_TV:
-		return get_slave_funcs(nv_encoder)->
-			mode_valid(to_drm_encoder(nv_encoder), mode);
+		return get_slave_funcs(encoder)->mode_valid(encoder, mode);
 	case OUTPUT_DP:
 		if (nv_encoder->dp.link_bw == DP_LINK_BW_2_7)
 			max_clock = nv_encoder->dp.link_nr * 270000;
 		else
 			max_clock = nv_encoder->dp.link_nr * 162000;
 
-		clock *= 3;
+		clock = clock * nouveau_connector_bpp(connector) / 8;
 		break;
 	default:
 		BUG_ON(1);
@@ -643,6 +709,44 @@ nouveau_connector_best_encoder(struct drm_connector *connector)
 	return NULL;
 }
 
+void
+nouveau_connector_set_polling(struct drm_connector *connector)
+{
+	struct drm_device *dev = connector->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc;
+	bool spare_crtc = false;
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
+		spare_crtc |= !crtc->enabled;
+
+	connector->polled = 0;
+
+	switch (connector->connector_type) {
+	case DRM_MODE_CONNECTOR_VGA:
+	case DRM_MODE_CONNECTOR_TV:
+		if (dev_priv->card_type >= NV_50 ||
+		    (nv_gf4_disp_arch(dev) && spare_crtc))
+			connector->polled = DRM_CONNECTOR_POLL_CONNECT;
+		break;
+
+	case DRM_MODE_CONNECTOR_DVII:
+	case DRM_MODE_CONNECTOR_DVID:
+	case DRM_MODE_CONNECTOR_HDMIA:
+	case DRM_MODE_CONNECTOR_DisplayPort:
+	case DRM_MODE_CONNECTOR_eDP:
+		if (dev_priv->card_type >= NV_50)
+			connector->polled = DRM_CONNECTOR_POLL_HPD;
+		else if (connector->connector_type == DRM_MODE_CONNECTOR_DVID ||
+			 spare_crtc)
+			connector->polled = DRM_CONNECTOR_POLL_CONNECT;
+		break;
+
+	default:
+		break;
+	}
+}
+
 static const struct drm_connector_helper_funcs
 nouveau_connector_helper_funcs = {
 	.get_modes = nouveau_connector_get_modes,
@@ -662,148 +766,74 @@ nouveau_connector_funcs = {
 	.force = nouveau_connector_force
 };
 
-static int
-nouveau_connector_create_lvds(struct drm_device *dev,
-			      struct drm_connector *connector)
-{
-	struct nouveau_connector *nv_connector = nouveau_connector(connector);
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_i2c_chan *i2c = NULL;
-	struct nouveau_encoder *nv_encoder;
-	struct drm_display_mode native, *mode, *temp;
-	bool dummy, if_is_24bit = false;
-	int ret, flags;
-
-	nv_encoder = find_encoder_by_type(connector, OUTPUT_LVDS);
-	if (!nv_encoder)
-		return -ENODEV;
-
-	ret = nouveau_bios_parse_lvds_table(dev, 0, &dummy, &if_is_24bit);
-	if (ret) {
-		NV_ERROR(dev, "Error parsing LVDS table, disabling LVDS\n");
-		return ret;
-	}
-	nv_connector->use_dithering = !if_is_24bit;
-
-	/* Firstly try getting EDID over DDC, if allowed and I2C channel
-	 * is available.
-	 */
-	if (!dev_priv->vbios.fp_no_ddc && nv_encoder->dcb->i2c_index < 0xf)
-		i2c = nouveau_i2c_find(dev, nv_encoder->dcb->i2c_index);
-
-	if (i2c) {
-		nouveau_connector_ddc_prepare(connector, &flags);
-		nv_connector->edid = drm_get_edid(connector, &i2c->adapter);
-		nouveau_connector_ddc_finish(connector, flags);
-	}
-
-	/* If no EDID found above, and the VBIOS indicates a hardcoded
-	 * modeline is avalilable for the panel, set it as the panel's
-	 * native mode and exit.
-	 */
-	if (!nv_connector->edid && nouveau_bios_fp_mode(dev, &native) &&
-	     (nv_encoder->dcb->lvdsconf.use_straps_for_mode ||
-	      dev_priv->vbios.fp_no_ddc)) {
-		nv_connector->native_mode = drm_mode_duplicate(dev, &native);
-		goto out;
-	}
-
-	/* Still nothing, some VBIOS images have a hardcoded EDID block
-	 * stored for the panel stored in them.
-	 */
-	if (!nv_connector->edid && !nv_connector->native_mode &&
-	    !dev_priv->vbios.fp_no_ddc) {
-		struct edid *edid =
-			(struct edid *)nouveau_bios_embedded_edid(dev);
-		if (edid) {
-			nv_connector->edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
-			*(nv_connector->edid) = *edid;
-		}
-	}
-
-	if (!nv_connector->edid)
-		goto out;
-
-	/* We didn't find/use a panel mode from the VBIOS, so parse the EDID
-	 * block and look for the preferred mode there.
-	 */
-	ret = drm_add_edid_modes(connector, nv_connector->edid);
-	if (ret == 0)
-		goto out;
-	nv_connector->detected_encoder = nv_encoder;
-	nv_connector->native_mode = nouveau_connector_native_mode(connector);
-	list_for_each_entry_safe(mode, temp, &connector->probed_modes, head)
-		drm_mode_remove(connector, mode);
-
-out:
-	if (!nv_connector->native_mode) {
-		NV_ERROR(dev, "LVDS present in DCB table, but couldn't "
-			      "determine its native mode.  Disabling.\n");
-		return -ENODEV;
-	}
-
-	drm_mode_connector_update_edid_property(connector, nv_connector->edid);
-	return 0;
-}
+static const struct drm_connector_funcs
+nouveau_connector_funcs_lvds = {
+	.dpms = drm_helper_connector_dpms,
+	.save = NULL,
+	.restore = NULL,
+	.detect = nouveau_connector_detect_lvds,
+	.destroy = nouveau_connector_destroy,
+	.fill_modes = drm_helper_probe_single_connector_modes,
+	.set_property = nouveau_connector_set_property,
+	.force = nouveau_connector_force
+};
 
-int
-nouveau_connector_create(struct drm_device *dev,
-			 struct dcb_connector_table_entry *dcb)
+struct drm_connector *
+nouveau_connector_create(struct drm_device *dev, int index)
 {
+	const struct drm_connector_funcs *funcs = &nouveau_connector_funcs;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_connector *nv_connector = NULL;
+	struct dcb_connector_table_entry *dcb = NULL;
 	struct drm_connector *connector;
-	struct drm_encoder *encoder;
-	int ret, type;
+	int type, ret = 0;
 
 	NV_DEBUG_KMS(dev, "\n");
 
+	if (index >= dev_priv->vbios.dcb.connector.entries)
+		return ERR_PTR(-EINVAL);
+
+	dcb = &dev_priv->vbios.dcb.connector.entry[index];
+	if (dcb->drm)
+		return dcb->drm;
+
 	switch (dcb->type) {
-	case DCB_CONNECTOR_NONE:
-		return 0;
 	case DCB_CONNECTOR_VGA:
-		NV_INFO(dev, "Detected a VGA connector\n");
 		type = DRM_MODE_CONNECTOR_VGA;
 		break;
 	case DCB_CONNECTOR_TV_0:
 	case DCB_CONNECTOR_TV_1:
 	case DCB_CONNECTOR_TV_3:
-		NV_INFO(dev, "Detected a TV connector\n");
 		type = DRM_MODE_CONNECTOR_TV;
 		break;
 	case DCB_CONNECTOR_DVI_I:
-		NV_INFO(dev, "Detected a DVI-I connector\n");
 		type = DRM_MODE_CONNECTOR_DVII;
 		break;
 	case DCB_CONNECTOR_DVI_D:
-		NV_INFO(dev, "Detected a DVI-D connector\n");
 		type = DRM_MODE_CONNECTOR_DVID;
 		break;
 	case DCB_CONNECTOR_HDMI_0:
 	case DCB_CONNECTOR_HDMI_1:
-		NV_INFO(dev, "Detected a HDMI connector\n");
 		type = DRM_MODE_CONNECTOR_HDMIA;
 		break;
 	case DCB_CONNECTOR_LVDS:
-		NV_INFO(dev, "Detected a LVDS connector\n");
 		type = DRM_MODE_CONNECTOR_LVDS;
+		funcs = &nouveau_connector_funcs_lvds;
 		break;
 	case DCB_CONNECTOR_DP:
-		NV_INFO(dev, "Detected a DisplayPort connector\n");
 		type = DRM_MODE_CONNECTOR_DisplayPort;
 		break;
 	case DCB_CONNECTOR_eDP:
-		NV_INFO(dev, "Detected an eDP connector\n");
 		type = DRM_MODE_CONNECTOR_eDP;
 		break;
 	default:
 		NV_ERROR(dev, "unknown connector type: 0x%02x!!\n", dcb->type);
-		return -EINVAL;
+		return ERR_PTR(-EINVAL);
 	}
 
 	nv_connector = kzalloc(sizeof(*nv_connector), GFP_KERNEL);
 	if (!nv_connector)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	nv_connector->dcb = dcb;
 	connector = &nv_connector->base;
 
@@ -811,27 +841,21 @@ nouveau_connector_create(struct drm_device *dev,
 	connector->interlace_allowed = false;
 	connector->doublescan_allowed = false;
 
-	drm_connector_init(dev, connector, &nouveau_connector_funcs, type);
+	drm_connector_init(dev, connector, funcs, type);
 	drm_connector_helper_add(connector, &nouveau_connector_helper_funcs);
 
-	/* attach encoders */
-	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
-		struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-
-		if (nv_encoder->dcb->connector != dcb->index)
-			continue;
-
-		if (get_slave_funcs(nv_encoder))
-			get_slave_funcs(nv_encoder)->create_resources(encoder, connector);
+	/* Check if we need dithering enabled */
+	if (dcb->type == DCB_CONNECTOR_LVDS) {
+		bool dummy, is_24bit = false;
 
-		drm_mode_connector_attach_encoder(connector, encoder);
-	}
+		ret = nouveau_bios_parse_lvds_table(dev, 0, &dummy, &is_24bit);
+		if (ret) {
+			NV_ERROR(dev, "Error parsing LVDS table, disabling "
+				 "LVDS\n");
+			goto fail;
+		}
 
-	if (!connector->encoder_ids[0]) {
-		NV_WARN(dev, "  no encoders, ignoring\n");
-		drm_connector_cleanup(connector);
-		kfree(connector);
-		return 0;
+		nv_connector->use_dithering = !is_24bit;
 	}
 
 	/* Init DVI-I specific properties */
@@ -841,12 +865,8 @@ nouveau_connector_create(struct drm_device *dev,
 		drm_connector_attach_property(connector, dev->mode_config.dvi_i_select_subconnector_property, 0);
 	}
 
-	if (dcb->type != DCB_CONNECTOR_LVDS)
-		nv_connector->use_dithering = false;
-
 	switch (dcb->type) {
 	case DCB_CONNECTOR_VGA:
-		connector->polled = DRM_CONNECTOR_POLL_CONNECT;
 		if (dev_priv->card_type >= NV_50) {
 			drm_connector_attach_property(connector,
 					dev->mode_config.scaling_mode_property,
@@ -858,17 +878,6 @@ nouveau_connector_create(struct drm_device *dev,
 	case DCB_CONNECTOR_TV_3:
 		nv_connector->scaling_mode = DRM_MODE_SCALE_NONE;
 		break;
-	case DCB_CONNECTOR_DP:
-	case DCB_CONNECTOR_eDP:
-	case DCB_CONNECTOR_HDMI_0:
-	case DCB_CONNECTOR_HDMI_1:
-	case DCB_CONNECTOR_DVI_I:
-	case DCB_CONNECTOR_DVI_D:
-		if (dev_priv->card_type >= NV_50)
-			connector->polled = DRM_CONNECTOR_POLL_HPD;
-		else
-			connector->polled = DRM_CONNECTOR_POLL_CONNECT;
-		/* fall-through */
 	default:
 		nv_connector->scaling_mode = DRM_MODE_SCALE_FULLSCREEN;
 
@@ -882,15 +891,15 @@ nouveau_connector_create(struct drm_device *dev,
 		break;
 	}
 
+	nouveau_connector_set_polling(connector);
+
 	drm_sysfs_connector_add(connector);
+	dcb->drm = connector;
+	return dcb->drm;
 
-	if (dcb->type == DCB_CONNECTOR_LVDS) {
-		ret = nouveau_connector_create_lvds(dev, connector);
-		if (ret) {
-			connector->funcs->destroy(connector);
-			return ret;
-		}
-	}
+fail:
+	drm_connector_cleanup(connector);
+	kfree(connector);
+	return ERR_PTR(ret);
 
-	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h
index 4ef38ab..c21ed6b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.h
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.h
@@ -49,7 +49,13 @@ static inline struct nouveau_connector *nouveau_connector(
 	return container_of(con, struct nouveau_connector, base);
 }
 
-int nouveau_connector_create(struct drm_device *,
-			     struct dcb_connector_table_entry *);
+struct drm_connector *
+nouveau_connector_create(struct drm_device *, int index);
+
+void
+nouveau_connector_set_polling(struct drm_connector *);
+
+int
+nouveau_connector_bpp(struct drm_connector *);
 
 #endif /* __NOUVEAU_CONNECTOR_H__ */
diff --git a/drivers/gpu/drm/nouveau/nouveau_debugfs.c b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
index 7933de4..8e15923 100644
--- a/drivers/gpu/drm/nouveau/nouveau_debugfs.c
+++ b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
@@ -157,7 +157,23 @@ nouveau_debugfs_vbios_image(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int
+nouveau_debugfs_evict_vram(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_nouveau_private *dev_priv = node->minor->dev->dev_private;
+	int ret;
+
+	ret = ttm_bo_evict_mm(&dev_priv->ttm.bdev, TTM_PL_VRAM);
+	if (ret)
+		seq_printf(m, "failed: %d\n", ret);
+	else
+		seq_printf(m, "succeeded\n");
+	return 0;
+}
+
 static struct drm_info_list nouveau_debugfs_list[] = {
+	{ "evict_vram", nouveau_debugfs_evict_vram, 0, NULL },
 	{ "chipset", nouveau_debugfs_chipset_info, 0, NULL },
 	{ "memory", nouveau_debugfs_memory_info, 0, NULL },
 	{ "vbios.rom", nouveau_debugfs_vbios_image, 0, NULL },
diff --git a/drivers/gpu/drm/nouveau/nouveau_dma.c b/drivers/gpu/drm/nouveau/nouveau_dma.c
index 65c441a..eb24e2b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dma.c
@@ -28,6 +28,7 @@
 #include "drm.h"
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
+#include "nouveau_ramht.h"
 
 void
 nouveau_dma_pre_init(struct nouveau_channel *chan)
@@ -58,26 +59,27 @@ nouveau_dma_init(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *m2mf = NULL;
-	struct nouveau_gpuobj *nvsw = NULL;
+	struct nouveau_gpuobj *obj = NULL;
 	int ret, i;
 
 	/* Create NV_MEMORY_TO_MEMORY_FORMAT for buffer moves */
 	ret = nouveau_gpuobj_gr_new(chan, dev_priv->card_type < NV_50 ?
-				    0x0039 : 0x5039, &m2mf);
+				    0x0039 : 0x5039, &obj);
 	if (ret)
 		return ret;
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, NvM2MF, m2mf, NULL);
+	ret = nouveau_ramht_insert(chan, NvM2MF, obj);
+	nouveau_gpuobj_ref(NULL, &obj);
 	if (ret)
 		return ret;
 
 	/* Create an NV_SW object for various sync purposes */
-	ret = nouveau_gpuobj_sw_new(chan, NV_SW, &nvsw);
+	ret = nouveau_gpuobj_sw_new(chan, NV_SW, &obj);
 	if (ret)
 		return ret;
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, NvSw, nvsw, NULL);
+	ret = nouveau_ramht_insert(chan, NvSw, obj);
+	nouveau_gpuobj_ref(NULL, &obj);
 	if (ret)
 		return ret;
 
@@ -91,13 +93,6 @@ nouveau_dma_init(struct nouveau_channel *chan)
 	if (ret)
 		return ret;
 
-	/* Map M2MF notifier object - fbcon. */
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		ret = nouveau_bo_map(chan->notifier_bo);
-		if (ret)
-			return ret;
-	}
-
 	/* Insert NOPS for NOUVEAU_DMA_SKIPS */
 	ret = RING_SPACE(chan, NOUVEAU_DMA_SKIPS);
 	if (ret)
@@ -219,7 +214,7 @@ nv50_dma_push_wait(struct nouveau_channel *chan, int count)
 
 		chan->dma.ib_free = get - chan->dma.ib_put;
 		if (chan->dma.ib_free <= 0)
-			chan->dma.ib_free += chan->dma.ib_max + 1;
+			chan->dma.ib_free += chan->dma.ib_max;
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/nouveau/nouveau_dp.c b/drivers/gpu/drm/nouveau/nouveau_dp.c
index deeb21c..4562f30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dp.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dp.c
@@ -23,8 +23,10 @@
  */
 
 #include "drmP.h"
+
 #include "nouveau_drv.h"
 #include "nouveau_i2c.h"
+#include "nouveau_connector.h"
 #include "nouveau_encoder.h"
 
 static int
@@ -270,13 +272,39 @@ bool
 nouveau_dp_link_train(struct drm_encoder *encoder)
 {
 	struct drm_device *dev = encoder->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_gpio_engine *pgpio = &dev_priv->engine.gpio;
 	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-	uint8_t config[4];
-	uint8_t status[3];
+	struct nouveau_connector *nv_connector;
+	struct bit_displayport_encoder_table *dpe;
+	int dpe_headerlen;
+	uint8_t config[4], status[3];
 	bool cr_done, cr_max_vs, eq_done;
 	int ret = 0, i, tries, voltage;
 
 	NV_DEBUG_KMS(dev, "link training!!\n");
+
+	nv_connector = nouveau_encoder_connector_get(nv_encoder);
+	if (!nv_connector)
+		return false;
+
+	dpe = nouveau_bios_dp_table(dev, nv_encoder->dcb, &dpe_headerlen);
+	if (!dpe) {
+		NV_ERROR(dev, "SOR-%d: no DP encoder table!\n", nv_encoder->or);
+		return false;
+	}
+
+	/* Disable hotplug detect: it flips around on some panels during
+	 * link training.
+	 */
+	pgpio->irq_enable(dev, nv_connector->dcb->gpio_tag, false);
+
+	if (dpe->script0) {
+		NV_DEBUG_KMS(dev, "SOR-%d: running DP script 0\n", nv_encoder->or);
+		nouveau_bios_run_init_table(dev, le16_to_cpu(dpe->script0),
+					    nv_encoder->dcb);
+	}
+
 train:
 	cr_done = eq_done = false;
 
@@ -289,7 +317,8 @@ train:
 		return false;
 
 	config[0] = nv_encoder->dp.link_nr;
-	if (nv_encoder->dp.dpcd_version >= 0x11)
+	if (nv_encoder->dp.dpcd_version >= 0x11 &&
+	    nv_encoder->dp.enhanced_frame)
 		config[0] |= DP_LANE_COUNT_ENHANCED_FRAME_EN;
 
 	ret = nouveau_dp_lane_count_set(encoder, config[0]);
@@ -403,6 +432,15 @@ stop:
 		}
 	}
 
+	if (dpe->script1) {
+		NV_DEBUG_KMS(dev, "SOR-%d: running DP script 1\n", nv_encoder->or);
+		nouveau_bios_run_init_table(dev, le16_to_cpu(dpe->script1),
+					    nv_encoder->dcb);
+	}
+
+	/* re-enable hotplug detect */
+	pgpio->irq_enable(dev, nv_connector->dcb->gpio_tag, true);
+
 	return eq_done;
 }
 
@@ -431,10 +469,12 @@ nouveau_dp_detect(struct drm_encoder *encoder)
 	    !nv_encoder->dcb->dpconf.link_bw)
 		nv_encoder->dp.link_bw = DP_LINK_BW_1_62;
 
-	nv_encoder->dp.link_nr = dpcd[2] & 0xf;
+	nv_encoder->dp.link_nr = dpcd[2] & DP_MAX_LANE_COUNT_MASK;
 	if (nv_encoder->dp.link_nr > nv_encoder->dcb->dpconf.link_nr)
 		nv_encoder->dp.link_nr = nv_encoder->dcb->dpconf.link_nr;
 
+	nv_encoder->dp.enhanced_frame = (dpcd[2] & DP_ENHANCED_FRAME_CAP);
+
 	return true;
 }
 
@@ -487,7 +527,8 @@ nouveau_dp_auxch(struct nouveau_i2c_chan *auxch, int cmd, int addr,
 		nv_wr32(dev, NV50_AUXCH_CTRL(index), ctrl | 0x80000000);
 		nv_wr32(dev, NV50_AUXCH_CTRL(index), ctrl);
 		nv_wr32(dev, NV50_AUXCH_CTRL(index), ctrl | 0x00010000);
-		if (!nv_wait(NV50_AUXCH_CTRL(index), 0x00010000, 0x00000000)) {
+		if (!nv_wait(dev, NV50_AUXCH_CTRL(index),
+			     0x00010000, 0x00000000)) {
 			NV_ERROR(dev, "expected bit 16 == 0, got 0x%08x\n",
 				 nv_rd32(dev, NV50_AUXCH_CTRL(index)));
 			ret = -EBUSY;
@@ -535,47 +576,64 @@ out:
 	return ret ? ret : (stat & NV50_AUXCH_STAT_REPLY);
 }
 
-int
-nouveau_dp_i2c_aux_ch(struct i2c_adapter *adapter, int mode,
-		      uint8_t write_byte, uint8_t *read_byte)
+static int
+nouveau_dp_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
 {
-	struct i2c_algo_dp_aux_data *algo_data = adapter->algo_data;
-	struct nouveau_i2c_chan *auxch = (struct nouveau_i2c_chan *)adapter;
+	struct nouveau_i2c_chan *auxch = (struct nouveau_i2c_chan *)adap;
 	struct drm_device *dev = auxch->dev;
-	int ret = 0, cmd, addr = algo_data->address;
-	uint8_t *buf;
-
-	if (mode == MODE_I2C_READ) {
-		cmd = AUX_I2C_READ;
-		buf = read_byte;
-	} else {
-		cmd = (mode & MODE_I2C_READ) ? AUX_I2C_READ : AUX_I2C_WRITE;
-		buf = &write_byte;
-	}
+	struct i2c_msg *msg = msgs;
+	int ret, mcnt = num;
 
-	if (!(mode & MODE_I2C_STOP))
-		cmd |= AUX_I2C_MOT;
+	while (mcnt--) {
+		u8 remaining = msg->len;
+		u8 *ptr = msg->buf;
 
-	if (mode & MODE_I2C_START)
-		return 1;
+		while (remaining) {
+			u8 cnt = (remaining > 16) ? 16 : remaining;
+			u8 cmd;
 
-	for (;;) {
-		ret = nouveau_dp_auxch(auxch, cmd, addr, buf, 1);
-		if (ret < 0)
-			return ret;
-
-		switch (ret & NV50_AUXCH_STAT_REPLY_I2C) {
-		case NV50_AUXCH_STAT_REPLY_I2C_ACK:
-			return 1;
-		case NV50_AUXCH_STAT_REPLY_I2C_NACK:
-			return -EREMOTEIO;
-		case NV50_AUXCH_STAT_REPLY_I2C_DEFER:
-			udelay(100);
-			break;
-		default:
-			NV_ERROR(dev, "invalid auxch status: 0x%08x\n", ret);
-			return -EREMOTEIO;
+			if (msg->flags & I2C_M_RD)
+				cmd = AUX_I2C_READ;
+			else
+				cmd = AUX_I2C_WRITE;
+
+			if (mcnt || remaining > 16)
+				cmd |= AUX_I2C_MOT;
+
+			ret = nouveau_dp_auxch(auxch, cmd, msg->addr, ptr, cnt);
+			if (ret < 0)
+				return ret;
+
+			switch (ret & NV50_AUXCH_STAT_REPLY_I2C) {
+			case NV50_AUXCH_STAT_REPLY_I2C_ACK:
+				break;
+			case NV50_AUXCH_STAT_REPLY_I2C_NACK:
+				return -EREMOTEIO;
+			case NV50_AUXCH_STAT_REPLY_I2C_DEFER:
+				udelay(100);
+				continue;
+			default:
+				NV_ERROR(dev, "bad auxch reply: 0x%08x\n", ret);
+				return -EREMOTEIO;
+			}
+
+			ptr += cnt;
+			remaining -= cnt;
 		}
+
+		msg++;
 	}
+
+	return num;
+}
+
+static u32
+nouveau_dp_i2c_func(struct i2c_adapter *adap)
+{
+	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
 }
 
+const struct i2c_algorithm nouveau_dp_i2c_algo = {
+	.master_xfer = nouveau_dp_i2c_xfer,
+	.functionality = nouveau_dp_i2c_func
+};
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.c b/drivers/gpu/drm/nouveau/nouveau_drv.c
index 2737704..ee2442f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.c
@@ -35,13 +35,9 @@
 
 #include "drm_pciids.h"
 
-MODULE_PARM_DESC(ctxfw, "Use external firmware blob for grctx init (NV40)");
-int nouveau_ctxfw = 0;
-module_param_named(ctxfw, nouveau_ctxfw, int, 0400);
-
-MODULE_PARM_DESC(noagp, "Disable AGP");
-int nouveau_noagp;
-module_param_named(noagp, nouveau_noagp, int, 0400);
+MODULE_PARM_DESC(agpmode, "AGP mode (0 to disable AGP)");
+int nouveau_agpmode = -1;
+module_param_named(agpmode, nouveau_agpmode, int, 0400);
 
 MODULE_PARM_DESC(modeset, "Enable kernel modesetting");
 static int nouveau_modeset = -1; /* kms */
@@ -56,7 +52,7 @@ int nouveau_vram_pushbuf;
 module_param_named(vram_pushbuf, nouveau_vram_pushbuf, int, 0400);
 
 MODULE_PARM_DESC(vram_notify, "Force DMA notifiers to be in VRAM");
-int nouveau_vram_notify = 1;
+int nouveau_vram_notify = 0;
 module_param_named(vram_notify, nouveau_vram_notify, int, 0400);
 
 MODULE_PARM_DESC(duallink, "Allow dual-link TMDS (>=GeForce 8)");
@@ -83,6 +79,10 @@ MODULE_PARM_DESC(nofbaccel, "Disable fbcon acceleration");
 int nouveau_nofbaccel = 0;
 module_param_named(nofbaccel, nouveau_nofbaccel, int, 0400);
 
+MODULE_PARM_DESC(force_post, "Force POST");
+int nouveau_force_post = 0;
+module_param_named(force_post, nouveau_force_post, int, 0400);
+
 MODULE_PARM_DESC(override_conntype, "Ignore DCB connector type");
 int nouveau_override_conntype = 0;
 module_param_named(override_conntype, nouveau_override_conntype, int, 0400);
@@ -155,9 +155,6 @@ nouveau_pci_suspend(struct pci_dev *pdev, pm_message_t pm_state)
 	struct drm_crtc *crtc;
 	int ret, i;
 
-	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		return -ENODEV;
-
 	if (pm_state.event == PM_EVENT_PRETHAW)
 		return 0;
 
@@ -257,9 +254,6 @@ nouveau_pci_resume(struct pci_dev *pdev)
 	struct drm_crtc *crtc;
 	int ret, i;
 
-	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		return -ENODEV;
-
 	nouveau_fbcon_save_disable_accel(dev);
 
 	NV_INFO(dev, "We're back, enabling device...\n");
@@ -269,6 +263,13 @@ nouveau_pci_resume(struct pci_dev *pdev)
 		return -1;
 	pci_set_master(dev->pdev);
 
+	/* Make sure the AGP controller is in a consistent state */
+	if (dev_priv->gart_info.type == NOUVEAU_GART_AGP)
+		nouveau_mem_reset_agp(dev);
+
+	/* Make the CRTCs accessible */
+	engine->display.early_init(dev);
+
 	NV_INFO(dev, "POSTing device...\n");
 	ret = nouveau_run_vbios_init(dev);
 	if (ret)
@@ -323,7 +324,6 @@ nouveau_pci_resume(struct pci_dev *pdev)
 
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
 		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		int ret;
 
 		ret = nouveau_bo_pin(nv_crtc->cursor.nvbo, TTM_PL_FLAG_VRAM);
 		if (!ret)
@@ -332,11 +332,7 @@ nouveau_pci_resume(struct pci_dev *pdev)
 			NV_ERROR(dev, "Could not pin/map cursor.\n");
 	}
 
-	if (dev_priv->card_type < NV_50) {
-		nv04_display_restore(dev);
-		NVLockVgaCrtcs(dev, false);
-	} else
-		nv50_display_init(dev);
+	engine->display.init(dev);
 
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
 		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
@@ -371,7 +367,8 @@ nouveau_pci_resume(struct pci_dev *pdev)
 static struct drm_driver driver = {
 	.driver_features =
 		DRIVER_USE_AGP | DRIVER_PCI_DMA | DRIVER_SG |
-		DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM,
+		DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
+		DRIVER_MODESET,
 	.load = nouveau_load,
 	.firstopen = nouveau_firstopen,
 	.lastclose = nouveau_lastclose,
@@ -438,16 +435,18 @@ static int __init nouveau_init(void)
 			nouveau_modeset = 1;
 	}
 
-	if (nouveau_modeset == 1) {
-		driver.driver_features |= DRIVER_MODESET;
-		nouveau_register_dsm_handler();
-	}
+	if (!nouveau_modeset)
+		return 0;
 
+	nouveau_register_dsm_handler();
 	return drm_init(&driver);
 }
 
 static void __exit nouveau_exit(void)
 {
+	if (!nouveau_modeset)
+		return;
+
 	drm_exit(&driver);
 	nouveau_unregister_dsm_handler();
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8be2f59..be53e92 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -123,14 +123,6 @@ nvbo_kmap_obj_iovirtual(struct nouveau_bo *nvbo)
 	return ioptr;
 }
 
-struct mem_block {
-	struct mem_block *next;
-	struct mem_block *prev;
-	uint64_t start;
-	uint64_t size;
-	struct drm_file *file_priv; /* NULL: free, -1: heap, other: real files */
-};
-
 enum nouveau_flags {
 	NV_NFORCE   = 0x10000000,
 	NV_NFORCE2  = 0x20000000
@@ -141,22 +133,24 @@ enum nouveau_flags {
 #define NVOBJ_ENGINE_DISPLAY	2
 #define NVOBJ_ENGINE_INT	0xdeadbeef
 
-#define NVOBJ_FLAG_ALLOW_NO_REFS	(1 << 0)
 #define NVOBJ_FLAG_ZERO_ALLOC		(1 << 1)
 #define NVOBJ_FLAG_ZERO_FREE		(1 << 2)
-#define NVOBJ_FLAG_FAKE			(1 << 3)
 struct nouveau_gpuobj {
+	struct drm_device *dev;
+	struct kref refcount;
 	struct list_head list;
 
-	struct nouveau_channel *im_channel;
-	struct mem_block *im_pramin;
+	struct drm_mm_node *im_pramin;
 	struct nouveau_bo *im_backing;
-	uint32_t im_backing_start;
 	uint32_t *im_backing_suspend;
 	int im_bound;
 
 	uint32_t flags;
-	int refcount;
+
+	u32 size;
+	u32 pinst;
+	u32 cinst;
+	u64 vinst;
 
 	uint32_t engine;
 	uint32_t class;
@@ -165,16 +159,6 @@ struct nouveau_gpuobj {
 	void *priv;
 };
 
-struct nouveau_gpuobj_ref {
-	struct list_head list;
-
-	struct nouveau_gpuobj *gpuobj;
-	uint32_t instance;
-
-	struct nouveau_channel *channel;
-	int handle;
-};
-
 struct nouveau_channel {
 	struct drm_device *dev;
 	int id;
@@ -196,37 +180,36 @@ struct nouveau_channel {
 		struct list_head pending;
 		uint32_t sequence;
 		uint32_t sequence_ack;
-		uint32_t last_sequence_irq;
+		atomic_t last_sequence_irq;
 	} fence;
 
 	/* DMA push buffer */
-	struct nouveau_gpuobj_ref *pushbuf;
-	struct nouveau_bo         *pushbuf_bo;
-	uint32_t                   pushbuf_base;
+	struct nouveau_gpuobj *pushbuf;
+	struct nouveau_bo     *pushbuf_bo;
+	uint32_t               pushbuf_base;
 
 	/* Notifier memory */
 	struct nouveau_bo *notifier_bo;
-	struct mem_block *notifier_heap;
+	struct drm_mm notifier_heap;
 
 	/* PFIFO context */
-	struct nouveau_gpuobj_ref *ramfc;
-	struct nouveau_gpuobj_ref *cache;
+	struct nouveau_gpuobj *ramfc;
+	struct nouveau_gpuobj *cache;
 
 	/* PGRAPH context */
 	/* XXX may be merge 2 pointers as private data ??? */
-	struct nouveau_gpuobj_ref *ramin_grctx;
+	struct nouveau_gpuobj *ramin_grctx;
 	void *pgraph_ctx;
 
 	/* NV50 VM */
-	struct nouveau_gpuobj     *vm_pd;
-	struct nouveau_gpuobj_ref *vm_gart_pt;
-	struct nouveau_gpuobj_ref *vm_vram_pt[NV50_VM_VRAM_NR];
+	struct nouveau_gpuobj *vm_pd;
+	struct nouveau_gpuobj *vm_gart_pt;
+	struct nouveau_gpuobj *vm_vram_pt[NV50_VM_VRAM_NR];
 
 	/* Objects */
-	struct nouveau_gpuobj_ref *ramin; /* Private instmem */
-	struct mem_block          *ramin_heap; /* Private PRAMIN heap */
-	struct nouveau_gpuobj_ref *ramht; /* Hash table */
-	struct list_head           ramht_refs; /* Objects referenced by RAMHT */
+	struct nouveau_gpuobj *ramin; /* Private instmem */
+	struct drm_mm          ramin_heap; /* Private PRAMIN heap */
+	struct nouveau_ramht  *ramht; /* Hash table */
 
 	/* GPU object info for stuff used in-kernel (mm_enabled) */
 	uint32_t m2mf_ntfy;
@@ -277,8 +260,7 @@ struct nouveau_instmem_engine {
 	void	(*clear)(struct drm_device *, struct nouveau_gpuobj *);
 	int	(*bind)(struct drm_device *, struct nouveau_gpuobj *);
 	int	(*unbind)(struct drm_device *, struct nouveau_gpuobj *);
-	void	(*prepare_access)(struct drm_device *, bool write);
-	void	(*finish_access)(struct drm_device *);
+	void	(*flush)(struct drm_device *);
 };
 
 struct nouveau_mc_engine {
@@ -303,17 +285,17 @@ struct nouveau_fb_engine {
 };
 
 struct nouveau_fifo_engine {
-	void *priv;
-
 	int  channels;
 
+	struct nouveau_gpuobj *playlist[2];
+	int cur_playlist;
+
 	int  (*init)(struct drm_device *);
 	void (*takedown)(struct drm_device *);
 
 	void (*disable)(struct drm_device *);
 	void (*enable)(struct drm_device *);
 	bool (*reassign)(struct drm_device *, bool enable);
-	bool (*cache_flush)(struct drm_device *dev);
 	bool (*cache_pull)(struct drm_device *dev, bool enable);
 
 	int  (*channel_id)(struct drm_device *);
@@ -339,10 +321,11 @@ struct nouveau_pgraph_object_class {
 struct nouveau_pgraph_engine {
 	struct nouveau_pgraph_object_class *grclass;
 	bool accel_blocked;
-	void *ctxprog;
-	void *ctxvals;
 	int grctx_size;
 
+	/* NV2x/NV3x context table (0x400780) */
+	struct nouveau_gpuobj *ctx_table;
+
 	int  (*init)(struct drm_device *);
 	void (*takedown)(struct drm_device *);
 
@@ -358,6 +341,24 @@ struct nouveau_pgraph_engine {
 				  uint32_t size, uint32_t pitch);
 };
 
+struct nouveau_display_engine {
+	int (*early_init)(struct drm_device *);
+	void (*late_takedown)(struct drm_device *);
+	int (*create)(struct drm_device *);
+	int (*init)(struct drm_device *);
+	void (*destroy)(struct drm_device *);
+};
+
+struct nouveau_gpio_engine {
+	int  (*init)(struct drm_device *);
+	void (*takedown)(struct drm_device *);
+
+	int  (*get)(struct drm_device *, enum dcb_gpio_tag);
+	int  (*set)(struct drm_device *, enum dcb_gpio_tag, int state);
+
+	void (*irq_enable)(struct drm_device *, enum dcb_gpio_tag, bool on);
+};
+
 struct nouveau_engine {
 	struct nouveau_instmem_engine instmem;
 	struct nouveau_mc_engine      mc;
@@ -365,6 +366,8 @@ struct nouveau_engine {
 	struct nouveau_fb_engine      fb;
 	struct nouveau_pgraph_engine  graph;
 	struct nouveau_fifo_engine    fifo;
+	struct nouveau_display_engine display;
+	struct nouveau_gpio_engine    gpio;
 };
 
 struct nouveau_pll_vals {
@@ -397,7 +400,7 @@ enum nv04_fp_display_regs {
 
 struct nv04_crtc_reg {
 	unsigned char MiscOutReg;     /* */
-	uint8_t CRTC[0x9f];
+	uint8_t CRTC[0xa0];
 	uint8_t CR58[0x10];
 	uint8_t Sequencer[5];
 	uint8_t Graphics[9];
@@ -496,15 +499,11 @@ enum nouveau_card_type {
 	NV_30      = 0x30,
 	NV_40      = 0x40,
 	NV_50      = 0x50,
+	NV_C0      = 0xc0,
 };
 
 struct drm_nouveau_private {
 	struct drm_device *dev;
-	enum {
-		NOUVEAU_CARD_INIT_DOWN,
-		NOUVEAU_CARD_INIT_DONE,
-		NOUVEAU_CARD_INIT_FAILED
-	} init_state;
 
 	/* the card type, takes NV_* as values */
 	enum nouveau_card_type card_type;
@@ -513,8 +512,14 @@ struct drm_nouveau_private {
 	int flags;
 
 	void __iomem *mmio;
+
+	spinlock_t ramin_lock;
 	void __iomem *ramin;
-	uint32_t ramin_size;
+	u32 ramin_size;
+	u32 ramin_base;
+	bool ramin_available;
+	struct drm_mm ramin_heap;
+	struct list_head gpuobj_list;
 
 	struct nouveau_bo *vga_ram;
 
@@ -528,13 +533,9 @@ struct drm_nouveau_private {
 		struct ttm_global_reference mem_global_ref;
 		struct ttm_bo_global_ref bo_global_ref;
 		struct ttm_bo_device bdev;
-		spinlock_t bo_list_lock;
-		struct list_head bo_list;
 		atomic_t validate_sequence;
 	} ttm;
 
-	struct fb_info *fbdev_info;
-
 	int fifo_alloc_count;
 	struct nouveau_channel *fifos[NOUVEAU_MAX_CHANNEL_NR];
 
@@ -545,15 +546,11 @@ struct drm_nouveau_private {
 	spinlock_t context_switch_lock;
 
 	/* RAMIN configuration, RAMFC, RAMHT and RAMRO offsets */
-	struct nouveau_gpuobj *ramht;
+	struct nouveau_ramht  *ramht;
+	struct nouveau_gpuobj *ramfc;
+	struct nouveau_gpuobj *ramro;
+
 	uint32_t ramin_rsvd_vram;
-	uint32_t ramht_offset;
-	uint32_t ramht_size;
-	uint32_t ramht_bits;
-	uint32_t ramfc_offset;
-	uint32_t ramfc_size;
-	uint32_t ramro_offset;
-	uint32_t ramro_size;
 
 	struct {
 		enum {
@@ -571,14 +568,12 @@ struct drm_nouveau_private {
 	} gart_info;
 
 	/* nv10-nv40 tiling regions */
-	struct {
-		struct nouveau_tile_reg reg[NOUVEAU_MAX_TILE_NR];
-		spinlock_t lock;
-	} tile;
+	struct nouveau_tile_reg tile[NOUVEAU_MAX_TILE_NR];
 
 	/* VRAM/fb configuration */
 	uint64_t vram_size;
 	uint64_t vram_sys_base;
+	u32 vram_rblock_size;
 
 	uint64_t fb_phys;
 	uint64_t fb_available_size;
@@ -595,14 +590,6 @@ struct drm_nouveau_private {
 	struct nouveau_gpuobj *vm_vram_pt[NV50_VM_VRAM_NR];
 	int vm_vram_pt_nr;
 
-	struct mem_block *ramin_heap;
-
-	/* context table pointed to be NV_PGRAPH_CHANNEL_CTX_TABLE (0x400780) */
-	uint32_t ctx_table_size;
-	struct nouveau_gpuobj_ref *ctx_table;
-
-	struct list_head gpuobj_list;
-
 	struct nvbios vbios;
 
 	struct nv04_mode_state mode_reg;
@@ -618,6 +605,11 @@ struct drm_nouveau_private {
 	struct backlight_device *backlight;
 
 	struct nouveau_channel *evo;
+	struct {
+		struct dcb_entry *dcb;
+		u16 script;
+		u32 pclk;
+	} evo_irq;
 
 	struct {
 		struct dentry *channel_root;
@@ -652,14 +644,6 @@ nouveau_bo_ref(struct nouveau_bo *ref, struct nouveau_bo **pnvbo)
 	return 0;
 }
 
-#define NOUVEAU_CHECK_INITIALISED_WITH_RETURN do {            \
-	struct drm_nouveau_private *nv = dev->dev_private;    \
-	if (nv->init_state != NOUVEAU_CARD_INIT_DONE) {       \
-		NV_ERROR(dev, "called without init\n");       \
-		return -EINVAL;                               \
-	}                                                     \
-} while (0)
-
 #define NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(id, cl, ch) do {    \
 	struct drm_nouveau_private *nv = dev->dev_private;       \
 	if (!nouveau_channel_owner(dev, (cl), (id))) {           \
@@ -671,7 +655,7 @@ nouveau_bo_ref(struct nouveau_bo *ref, struct nouveau_bo **pnvbo)
 } while (0)
 
 /* nouveau_drv.c */
-extern int nouveau_noagp;
+extern int nouveau_agpmode;
 extern int nouveau_duallink;
 extern int nouveau_uscript_lvds;
 extern int nouveau_uscript_tmds;
@@ -682,10 +666,10 @@ extern int nouveau_tv_disable;
 extern char *nouveau_tv_norm;
 extern int nouveau_reg_debug;
 extern char *nouveau_vbios;
-extern int nouveau_ctxfw;
 extern int nouveau_ignorelid;
 extern int nouveau_nofbaccel;
 extern int nouveau_noaccel;
+extern int nouveau_force_post;
 extern int nouveau_override_conntype;
 
 extern int nouveau_pci_suspend(struct pci_dev *pdev, pm_message_t pm_state);
@@ -707,17 +691,12 @@ extern bool nouveau_wait_for_idle(struct drm_device *);
 extern int  nouveau_card_init(struct drm_device *);
 
 /* nouveau_mem.c */
-extern int  nouveau_mem_init_heap(struct mem_block **, uint64_t start,
-				 uint64_t size);
-extern struct mem_block *nouveau_mem_alloc_block(struct mem_block *,
-						 uint64_t size, int align2,
-						 struct drm_file *, int tail);
-extern void nouveau_mem_takedown(struct mem_block **heap);
-extern void nouveau_mem_free_block(struct mem_block *);
-extern int  nouveau_mem_detect(struct drm_device *dev);
-extern void nouveau_mem_release(struct drm_file *, struct mem_block *heap);
-extern int  nouveau_mem_init(struct drm_device *);
+extern int  nouveau_mem_vram_init(struct drm_device *);
+extern void nouveau_mem_vram_fini(struct drm_device *);
+extern int  nouveau_mem_gart_init(struct drm_device *);
+extern void nouveau_mem_gart_fini(struct drm_device *);
 extern int  nouveau_mem_init_agp(struct drm_device *);
+extern int  nouveau_mem_reset_agp(struct drm_device *);
 extern void nouveau_mem_close(struct drm_device *);
 extern struct nouveau_tile_reg *nv10_mem_set_tiling(struct drm_device *dev,
 						    uint32_t addr,
@@ -759,7 +738,6 @@ extern void nouveau_channel_free(struct nouveau_channel *);
 extern int  nouveau_gpuobj_early_init(struct drm_device *);
 extern int  nouveau_gpuobj_init(struct drm_device *);
 extern void nouveau_gpuobj_takedown(struct drm_device *);
-extern void nouveau_gpuobj_late_takedown(struct drm_device *);
 extern int  nouveau_gpuobj_suspend(struct drm_device *dev);
 extern void nouveau_gpuobj_suspend_cleanup(struct drm_device *dev);
 extern void nouveau_gpuobj_resume(struct drm_device *dev);
@@ -769,24 +747,11 @@ extern void nouveau_gpuobj_channel_takedown(struct nouveau_channel *);
 extern int nouveau_gpuobj_new(struct drm_device *, struct nouveau_channel *,
 			      uint32_t size, int align, uint32_t flags,
 			      struct nouveau_gpuobj **);
-extern int nouveau_gpuobj_del(struct drm_device *, struct nouveau_gpuobj **);
-extern int nouveau_gpuobj_ref_add(struct drm_device *, struct nouveau_channel *,
-				  uint32_t handle, struct nouveau_gpuobj *,
-				  struct nouveau_gpuobj_ref **);
-extern int nouveau_gpuobj_ref_del(struct drm_device *,
-				  struct nouveau_gpuobj_ref **);
-extern int nouveau_gpuobj_ref_find(struct nouveau_channel *, uint32_t handle,
-				   struct nouveau_gpuobj_ref **ref_ret);
-extern int nouveau_gpuobj_new_ref(struct drm_device *,
-				  struct nouveau_channel *alloc_chan,
-				  struct nouveau_channel *ref_chan,
-				  uint32_t handle, uint32_t size, int align,
-				  uint32_t flags, struct nouveau_gpuobj_ref **);
-extern int nouveau_gpuobj_new_fake(struct drm_device *,
-				   uint32_t p_offset, uint32_t b_offset,
-				   uint32_t size, uint32_t flags,
-				   struct nouveau_gpuobj **,
-				   struct nouveau_gpuobj_ref**);
+extern void nouveau_gpuobj_ref(struct nouveau_gpuobj *,
+			       struct nouveau_gpuobj **);
+extern int nouveau_gpuobj_new_fake(struct drm_device *, u32 pinst, u64 vinst,
+				   u32 size, u32 flags,
+				   struct nouveau_gpuobj **);
 extern int nouveau_gpuobj_dma_new(struct nouveau_channel *, int class,
 				  uint64_t offset, uint64_t size, int access,
 				  int target, struct nouveau_gpuobj **);
@@ -857,11 +822,13 @@ void nouveau_register_dsm_handler(void);
 void nouveau_unregister_dsm_handler(void);
 int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len);
 bool nouveau_acpi_rom_supported(struct pci_dev *pdev);
+int nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
 #else
 static inline void nouveau_register_dsm_handler(void) {}
 static inline void nouveau_unregister_dsm_handler(void) {}
 static inline bool nouveau_acpi_rom_supported(struct pci_dev *pdev) { return false; }
 static inline int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len) { return -EINVAL; }
+static inline int nouveau_acpi_edid(struct drm_device *dev, struct drm_connector *connector) { return -EINVAL; }
 #endif
 
 /* nouveau_backlight.c */
@@ -924,22 +891,29 @@ extern void nv10_fb_takedown(struct drm_device *);
 extern void nv10_fb_set_region_tiling(struct drm_device *, int, uint32_t,
 				      uint32_t, uint32_t);
 
+/* nv30_fb.c */
+extern int  nv30_fb_init(struct drm_device *);
+extern void nv30_fb_takedown(struct drm_device *);
+
 /* nv40_fb.c */
 extern int  nv40_fb_init(struct drm_device *);
 extern void nv40_fb_takedown(struct drm_device *);
 extern void nv40_fb_set_region_tiling(struct drm_device *, int, uint32_t,
 				      uint32_t, uint32_t);
-
 /* nv50_fb.c */
 extern int  nv50_fb_init(struct drm_device *);
 extern void nv50_fb_takedown(struct drm_device *);
+extern void nv50_fb_vm_trap(struct drm_device *, int display, const char *);
+
+/* nvc0_fb.c */
+extern int  nvc0_fb_init(struct drm_device *);
+extern void nvc0_fb_takedown(struct drm_device *);
 
 /* nv04_fifo.c */
 extern int  nv04_fifo_init(struct drm_device *);
 extern void nv04_fifo_disable(struct drm_device *);
 extern void nv04_fifo_enable(struct drm_device *);
 extern bool nv04_fifo_reassign(struct drm_device *, bool);
-extern bool nv04_fifo_cache_flush(struct drm_device *);
 extern bool nv04_fifo_cache_pull(struct drm_device *, bool);
 extern int  nv04_fifo_channel_id(struct drm_device *);
 extern int  nv04_fifo_create_context(struct nouveau_channel *);
@@ -971,6 +945,19 @@ extern void nv50_fifo_destroy_context(struct nouveau_channel *);
 extern int  nv50_fifo_load_context(struct nouveau_channel *);
 extern int  nv50_fifo_unload_context(struct drm_device *);
 
+/* nvc0_fifo.c */
+extern int  nvc0_fifo_init(struct drm_device *);
+extern void nvc0_fifo_takedown(struct drm_device *);
+extern void nvc0_fifo_disable(struct drm_device *);
+extern void nvc0_fifo_enable(struct drm_device *);
+extern bool nvc0_fifo_reassign(struct drm_device *, bool);
+extern bool nvc0_fifo_cache_pull(struct drm_device *, bool);
+extern int  nvc0_fifo_channel_id(struct drm_device *);
+extern int  nvc0_fifo_create_context(struct nouveau_channel *);
+extern void nvc0_fifo_destroy_context(struct nouveau_channel *);
+extern int  nvc0_fifo_load_context(struct nouveau_channel *);
+extern int  nvc0_fifo_unload_context(struct drm_device *);
+
 /* nv04_graph.c */
 extern struct nouveau_pgraph_object_class nv04_graph_grclass[];
 extern int  nv04_graph_init(struct drm_device *);
@@ -1035,11 +1022,15 @@ extern int  nv50_graph_unload_context(struct drm_device *);
 extern void nv50_graph_context_switch(struct drm_device *);
 extern int  nv50_grctx_init(struct nouveau_grctx *);
 
-/* nouveau_grctx.c */
-extern int  nouveau_grctx_prog_load(struct drm_device *);
-extern void nouveau_grctx_vals_load(struct drm_device *,
-				    struct nouveau_gpuobj *);
-extern void nouveau_grctx_fini(struct drm_device *);
+/* nvc0_graph.c */
+extern int  nvc0_graph_init(struct drm_device *);
+extern void nvc0_graph_takedown(struct drm_device *);
+extern void nvc0_graph_fifo_access(struct drm_device *, bool);
+extern struct nouveau_channel *nvc0_graph_channel(struct drm_device *);
+extern int  nvc0_graph_create_context(struct nouveau_channel *);
+extern void nvc0_graph_destroy_context(struct nouveau_channel *);
+extern int  nvc0_graph_load_context(struct nouveau_channel *);
+extern int  nvc0_graph_unload_context(struct drm_device *);
 
 /* nv04_instmem.c */
 extern int  nv04_instmem_init(struct drm_device *);
@@ -1051,8 +1042,7 @@ extern int  nv04_instmem_populate(struct drm_device *, struct nouveau_gpuobj *,
 extern void nv04_instmem_clear(struct drm_device *, struct nouveau_gpuobj *);
 extern int  nv04_instmem_bind(struct drm_device *, struct nouveau_gpuobj *);
 extern int  nv04_instmem_unbind(struct drm_device *, struct nouveau_gpuobj *);
-extern void nv04_instmem_prepare_access(struct drm_device *, bool write);
-extern void nv04_instmem_finish_access(struct drm_device *);
+extern void nv04_instmem_flush(struct drm_device *);
 
 /* nv50_instmem.c */
 extern int  nv50_instmem_init(struct drm_device *);
@@ -1064,8 +1054,21 @@ extern int  nv50_instmem_populate(struct drm_device *, struct nouveau_gpuobj *,
 extern void nv50_instmem_clear(struct drm_device *, struct nouveau_gpuobj *);
 extern int  nv50_instmem_bind(struct drm_device *, struct nouveau_gpuobj *);
 extern int  nv50_instmem_unbind(struct drm_device *, struct nouveau_gpuobj *);
-extern void nv50_instmem_prepare_access(struct drm_device *, bool write);
-extern void nv50_instmem_finish_access(struct drm_device *);
+extern void nv50_instmem_flush(struct drm_device *);
+extern void nv84_instmem_flush(struct drm_device *);
+extern void nv50_vm_flush(struct drm_device *, int engine);
+
+/* nvc0_instmem.c */
+extern int  nvc0_instmem_init(struct drm_device *);
+extern void nvc0_instmem_takedown(struct drm_device *);
+extern int  nvc0_instmem_suspend(struct drm_device *);
+extern void nvc0_instmem_resume(struct drm_device *);
+extern int  nvc0_instmem_populate(struct drm_device *, struct nouveau_gpuobj *,
+				  uint32_t *size);
+extern void nvc0_instmem_clear(struct drm_device *, struct nouveau_gpuobj *);
+extern int  nvc0_instmem_bind(struct drm_device *, struct nouveau_gpuobj *);
+extern int  nvc0_instmem_unbind(struct drm_device *, struct nouveau_gpuobj *);
+extern void nvc0_instmem_flush(struct drm_device *);
 
 /* nv04_mc.c */
 extern int  nv04_mc_init(struct drm_device *);
@@ -1088,13 +1091,14 @@ extern long nouveau_compat_ioctl(struct file *file, unsigned int cmd,
 				 unsigned long arg);
 
 /* nv04_dac.c */
-extern int nv04_dac_create(struct drm_device *dev, struct dcb_entry *entry);
+extern int nv04_dac_create(struct drm_connector *, struct dcb_entry *);
 extern uint32_t nv17_dac_sample_load(struct drm_encoder *encoder);
 extern int nv04_dac_output_offset(struct drm_encoder *encoder);
 extern void nv04_dac_update_dacclk(struct drm_encoder *encoder, bool enable);
+extern bool nv04_dac_in_use(struct drm_encoder *encoder);
 
 /* nv04_dfp.c */
-extern int nv04_dfp_create(struct drm_device *dev, struct dcb_entry *entry);
+extern int nv04_dfp_create(struct drm_connector *, struct dcb_entry *);
 extern int nv04_dfp_get_bound_head(struct drm_device *dev, struct dcb_entry *dcbent);
 extern void nv04_dfp_bind_head(struct drm_device *dev, struct dcb_entry *dcbent,
 			       int head, bool dl);
@@ -1103,15 +1107,17 @@ extern void nv04_dfp_update_fp_control(struct drm_encoder *encoder, int mode);
 
 /* nv04_tv.c */
 extern int nv04_tv_identify(struct drm_device *dev, int i2c_index);
-extern int nv04_tv_create(struct drm_device *dev, struct dcb_entry *entry);
+extern int nv04_tv_create(struct drm_connector *, struct dcb_entry *);
 
 /* nv17_tv.c */
-extern int nv17_tv_create(struct drm_device *dev, struct dcb_entry *entry);
+extern int nv17_tv_create(struct drm_connector *, struct dcb_entry *);
 
 /* nv04_display.c */
+extern int nv04_display_early_init(struct drm_device *);
+extern void nv04_display_late_takedown(struct drm_device *);
 extern int nv04_display_create(struct drm_device *);
+extern int nv04_display_init(struct drm_device *);
 extern void nv04_display_destroy(struct drm_device *);
-extern void nv04_display_restore(struct drm_device *);
 
 /* nv04_crtc.c */
 extern int nv04_crtc_create(struct drm_device *, int index);
@@ -1148,7 +1154,6 @@ extern int nouveau_fence_wait(void *obj, void *arg, bool lazy, bool intr);
 extern int nouveau_fence_flush(void *obj, void *arg);
 extern void nouveau_fence_unref(void **obj);
 extern void *nouveau_fence_ref(void *obj);
-extern void nouveau_fence_handler(struct drm_device *dev, int channel);
 
 /* nouveau_gem.c */
 extern int nouveau_gem_new(struct drm_device *, struct nouveau_channel *,
@@ -1168,13 +1173,15 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device *, void *,
 extern int nouveau_gem_ioctl_info(struct drm_device *, void *,
 				  struct drm_file *);
 
-/* nv17_gpio.c */
-int nv17_gpio_get(struct drm_device *dev, enum dcb_gpio_tag tag);
-int nv17_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state);
+/* nv10_gpio.c */
+int nv10_gpio_get(struct drm_device *dev, enum dcb_gpio_tag tag);
+int nv10_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state);
 
 /* nv50_gpio.c */
+int nv50_gpio_init(struct drm_device *dev);
 int nv50_gpio_get(struct drm_device *dev, enum dcb_gpio_tag tag);
 int nv50_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state);
+void nv50_gpio_irq_enable(struct drm_device *, enum dcb_gpio_tag, bool on);
 
 /* nv50_calc. */
 int nv50_calc_pll(struct drm_device *, struct pll_lims *, int clk,
@@ -1221,6 +1228,13 @@ static inline void nv_wr32(struct drm_device *dev, unsigned reg, u32 val)
 	iowrite32_native(val, dev_priv->mmio + reg);
 }
 
+static inline u32 nv_mask(struct drm_device *dev, u32 reg, u32 mask, u32 val)
+{
+	u32 tmp = nv_rd32(dev, reg);
+	nv_wr32(dev, reg, (tmp & ~mask) | val);
+	return tmp;
+}
+
 static inline u8 nv_rd08(struct drm_device *dev, unsigned reg)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
@@ -1233,7 +1247,7 @@ static inline void nv_wr08(struct drm_device *dev, unsigned reg, u8 val)
 	iowrite8(val, dev_priv->mmio + reg);
 }
 
-#define nv_wait(reg, mask, val) \
+#define nv_wait(dev, reg, mask, val) \
 	nouveau_wait_until(dev, 2000000000ULL, (reg), (mask), (val))
 
 /* PRAMIN access */
@@ -1250,17 +1264,8 @@ static inline void nv_wi32(struct drm_device *dev, unsigned offset, u32 val)
 }
 
 /* object access */
-static inline u32 nv_ro32(struct drm_device *dev, struct nouveau_gpuobj *obj,
-				unsigned index)
-{
-	return nv_ri32(dev, obj->im_pramin->start + index * 4);
-}
-
-static inline void nv_wo32(struct drm_device *dev, struct nouveau_gpuobj *obj,
-				unsigned index, u32 val)
-{
-	nv_wi32(dev, obj->im_pramin->start + index * 4, val);
-}
+extern u32 nv_ro32(struct nouveau_gpuobj *, u32 offset);
+extern void nv_wo32(struct nouveau_gpuobj *, u32 offset, u32 val);
 
 /*
  * Logging
@@ -1347,6 +1352,15 @@ nv_two_reg_pll(struct drm_device *dev)
 	return false;
 }
 
+static inline bool
+nv_match_device(struct drm_device *dev, unsigned device,
+		unsigned sub_vendor, unsigned sub_device)
+{
+	return dev->pdev->device == device &&
+		dev->pdev->subsystem_vendor == sub_vendor &&
+		dev->pdev->subsystem_device == sub_device;
+}
+
 #define NV_SW                                                        0x0000506e
 #define NV_SW_DMA_SEMAPHORE                                          0x00000060
 #define NV_SW_SEMAPHORE_OFFSET                                       0x00000064
diff --git a/drivers/gpu/drm/nouveau/nouveau_encoder.h b/drivers/gpu/drm/nouveau/nouveau_encoder.h
index e1df820..ae69b61 100644
--- a/drivers/gpu/drm/nouveau/nouveau_encoder.h
+++ b/drivers/gpu/drm/nouveau/nouveau_encoder.h
@@ -38,13 +38,15 @@ struct nouveau_encoder {
 	struct dcb_entry *dcb;
 	int or;
 
+	/* different to drm_encoder.crtc, this reflects what's
+	 * actually programmed on the hw, not the proposed crtc */
+	struct drm_crtc *crtc;
+
 	struct drm_display_mode mode;
 	int last_dpms;
 
 	struct nv04_output_reg restore;
 
-	void (*disconnect)(struct nouveau_encoder *encoder);
-
 	union {
 		struct {
 			int mc_unknown;
@@ -53,6 +55,7 @@ struct nouveau_encoder {
 			int dpcd_version;
 			int link_nr;
 			int link_bw;
+			bool enhanced_frame;
 		} dp;
 	};
 };
@@ -69,10 +72,16 @@ static inline struct drm_encoder *to_drm_encoder(struct nouveau_encoder *enc)
 	return &enc->base.base;
 }
 
+static inline struct drm_encoder_slave_funcs *
+get_slave_funcs(struct drm_encoder *enc)
+{
+	return to_encoder_slave(enc)->slave_funcs;
+}
+
 struct nouveau_connector *
 nouveau_encoder_connector_get(struct nouveau_encoder *encoder);
-int nv50_sor_create(struct drm_device *dev, struct dcb_entry *entry);
-int nv50_dac_create(struct drm_device *dev, struct dcb_entry *entry);
+int nv50_sor_create(struct drm_connector *, struct dcb_entry *);
+int nv50_dac_create(struct drm_connector *, struct dcb_entry *);
 
 struct bit_displayport_encoder_table {
 	uint32_t match;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 257ea13..11f13fc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -280,6 +280,8 @@ nouveau_fbcon_create(struct nouveau_fbdev *nfbdev,
 
 	if (dev_priv->channel && !nouveau_nofbaccel) {
 		switch (dev_priv->card_type) {
+		case NV_C0:
+			break;
 		case NV_50:
 			nv50_fbcon_accel_init(info);
 			info->fbops = &nv50_fbcon_ops;
@@ -333,7 +335,7 @@ nouveau_fbcon_output_poll_changed(struct drm_device *dev)
 	drm_fb_helper_hotplug_event(&dev_priv->nfbdev->helper);
 }
 
-int
+static int
 nouveau_fbcon_destroy(struct drm_device *dev, struct nouveau_fbdev *nfbdev)
 {
 	struct nouveau_framebuffer *nouveau_fb = &nfbdev->nouveau_fb;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index faddf53..6b208ff 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -67,12 +67,13 @@ nouveau_fence_update(struct nouveau_channel *chan)
 	if (USE_REFCNT)
 		sequence = nvchan_rd32(chan, 0x48);
 	else
-		sequence = chan->fence.last_sequence_irq;
+		sequence = atomic_read(&chan->fence.last_sequence_irq);
 
 	if (chan->fence.sequence_ack == sequence)
 		return;
 	chan->fence.sequence_ack = sequence;
 
+	spin_lock(&chan->fence.lock);
 	list_for_each_safe(entry, tmp, &chan->fence.pending) {
 		fence = list_entry(entry, struct nouveau_fence, entry);
 
@@ -84,6 +85,7 @@ nouveau_fence_update(struct nouveau_channel *chan)
 		if (sequence == chan->fence.sequence_ack)
 			break;
 	}
+	spin_unlock(&chan->fence.lock);
 }
 
 int
@@ -119,7 +121,6 @@ nouveau_fence_emit(struct nouveau_fence *fence)
 {
 	struct drm_nouveau_private *dev_priv = fence->channel->dev->dev_private;
 	struct nouveau_channel *chan = fence->channel;
-	unsigned long flags;
 	int ret;
 
 	ret = RING_SPACE(chan, 2);
@@ -127,9 +128,7 @@ nouveau_fence_emit(struct nouveau_fence *fence)
 		return ret;
 
 	if (unlikely(chan->fence.sequence == chan->fence.sequence_ack - 1)) {
-		spin_lock_irqsave(&chan->fence.lock, flags);
 		nouveau_fence_update(chan);
-		spin_unlock_irqrestore(&chan->fence.lock, flags);
 
 		BUG_ON(chan->fence.sequence ==
 		       chan->fence.sequence_ack - 1);
@@ -138,9 +137,9 @@ nouveau_fence_emit(struct nouveau_fence *fence)
 	fence->sequence = ++chan->fence.sequence;
 
 	kref_get(&fence->refcount);
-	spin_lock_irqsave(&chan->fence.lock, flags);
+	spin_lock(&chan->fence.lock);
 	list_add_tail(&fence->entry, &chan->fence.pending);
-	spin_unlock_irqrestore(&chan->fence.lock, flags);
+	spin_unlock(&chan->fence.lock);
 
 	BEGIN_RING(chan, NvSubSw, USE_REFCNT ? 0x0050 : 0x0150, 1);
 	OUT_RING(chan, fence->sequence);
@@ -173,14 +172,11 @@ nouveau_fence_signalled(void *sync_obj, void *sync_arg)
 {
 	struct nouveau_fence *fence = nouveau_fence(sync_obj);
 	struct nouveau_channel *chan = fence->channel;
-	unsigned long flags;
 
 	if (fence->signalled)
 		return true;
 
-	spin_lock_irqsave(&chan->fence.lock, flags);
 	nouveau_fence_update(chan);
-	spin_unlock_irqrestore(&chan->fence.lock, flags);
 	return fence->signalled;
 }
 
@@ -190,8 +186,6 @@ nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
 	unsigned long timeout = jiffies + (3 * DRM_HZ);
 	int ret = 0;
 
-	__set_current_state(intr ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
-
 	while (1) {
 		if (nouveau_fence_signalled(sync_obj, sync_arg))
 			break;
@@ -201,6 +195,8 @@ nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
 			break;
 		}
 
+		__set_current_state(intr ? TASK_INTERRUPTIBLE
+			: TASK_UNINTERRUPTIBLE);
 		if (lazy)
 			schedule_timeout(1);
 
@@ -221,27 +217,12 @@ nouveau_fence_flush(void *sync_obj, void *sync_arg)
 	return 0;
 }
 
-void
-nouveau_fence_handler(struct drm_device *dev, int channel)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_channel *chan = NULL;
-
-	if (channel >= 0 && channel < dev_priv->engine.fifo.channels)
-		chan = dev_priv->fifos[channel];
-
-	if (chan) {
-		spin_lock_irq(&chan->fence.lock);
-		nouveau_fence_update(chan);
-		spin_unlock_irq(&chan->fence.lock);
-	}
-}
-
 int
 nouveau_fence_init(struct nouveau_channel *chan)
 {
 	INIT_LIST_HEAD(&chan->fence.pending);
 	spin_lock_init(&chan->fence.lock);
+	atomic_set(&chan->fence.last_sequence_irq, 0);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 6937d53..613f878 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -137,8 +137,6 @@ nouveau_gem_ioctl_new(struct drm_device *dev, void *data,
 	uint32_t flags = 0;
 	int ret = 0;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	if (unlikely(dev_priv->ttm.bdev.dev_mapping == NULL))
 		dev_priv->ttm.bdev.dev_mapping = dev_priv->dev->dev_mapping;
 
@@ -578,10 +576,9 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 	struct drm_nouveau_gem_pushbuf_bo *bo;
 	struct nouveau_channel *chan;
 	struct validate_op op;
-	struct nouveau_fence *fence = 0;
+	struct nouveau_fence *fence = NULL;
 	int i, j, ret = 0, do_reloc = 0;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
 	NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(req->channel, file_priv, chan);
 
 	req->vram_available = dev_priv->fb_aper_free;
@@ -666,7 +663,7 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 				      push[i].length);
 		}
 	} else
-	if (dev_priv->card_type >= NV_20) {
+	if (dev_priv->chipset >= 0x25) {
 		ret = RING_SPACE(chan, req->nr_push * 2);
 		if (ret) {
 			NV_ERROR(dev, "cal_space: %d\n", ret);
@@ -741,7 +738,7 @@ out_next:
 		req->suffix0 = 0x00000000;
 		req->suffix1 = 0x00000000;
 	} else
-	if (dev_priv->card_type >= NV_20) {
+	if (dev_priv->chipset >= 0x25) {
 		req->suffix0 = 0x00020000;
 		req->suffix1 = 0x00000000;
 	} else {
@@ -776,8 +773,6 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
 	bool no_wait = !!(req->flags & NOUVEAU_GEM_CPU_PREP_NOWAIT);
 	int ret = -EINVAL;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	gem = drm_gem_object_lookup(dev, file_priv, req->handle);
 	if (!gem)
 		return ret;
@@ -816,8 +811,6 @@ nouveau_gem_ioctl_cpu_fini(struct drm_device *dev, void *data,
 	struct nouveau_bo *nvbo;
 	int ret = -EINVAL;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	gem = drm_gem_object_lookup(dev, file_priv, req->handle);
 	if (!gem)
 		return ret;
@@ -843,8 +836,6 @@ nouveau_gem_ioctl_info(struct drm_device *dev, void *data,
 	struct drm_gem_object *gem;
 	int ret;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	gem = drm_gem_object_lookup(dev, file_priv, req->handle);
 	if (!gem)
 		return -EINVAL;
diff --git a/drivers/gpu/drm/nouveau/nouveau_grctx.c b/drivers/gpu/drm/nouveau/nouveau_grctx.c
deleted file mode 100644
index f731c5f..0000000
--- a/drivers/gpu/drm/nouveau/nouveau_grctx.c
+++ /dev/null
@@ -1,160 +0,0 @@
-/*
- * Copyright 2009 Red Hat Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: Ben Skeggs
- */
-
-#include <linux/firmware.h>
-#include <linux/slab.h>
-
-#include "drmP.h"
-#include "nouveau_drv.h"
-
-struct nouveau_ctxprog {
-	uint32_t signature;
-	uint8_t  version;
-	uint16_t length;
-	uint32_t data[];
-} __attribute__ ((packed));
-
-struct nouveau_ctxvals {
-	uint32_t signature;
-	uint8_t  version;
-	uint32_t length;
-	struct {
-		uint32_t offset;
-		uint32_t value;
-	} data[];
-} __attribute__ ((packed));
-
-int
-nouveau_grctx_prog_load(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
-	const int chipset = dev_priv->chipset;
-	const struct firmware *fw;
-	const struct nouveau_ctxprog *cp;
-	const struct nouveau_ctxvals *cv;
-	char name[32];
-	int ret, i;
-
-	if (pgraph->accel_blocked)
-		return -ENODEV;
-
-	if (!pgraph->ctxprog) {
-		sprintf(name, "nouveau/nv%02x.ctxprog", chipset);
-		ret = request_firmware(&fw, name, &dev->pdev->dev);
-		if (ret) {
-			NV_ERROR(dev, "No ctxprog for NV%02x\n", chipset);
-			return ret;
-		}
-
-		pgraph->ctxprog = kmemdup(fw->data, fw->size, GFP_KERNEL);
-		if (!pgraph->ctxprog) {
-			NV_ERROR(dev, "OOM copying ctxprog\n");
-			release_firmware(fw);
-			return -ENOMEM;
-		}
-
-		cp = pgraph->ctxprog;
-		if (le32_to_cpu(cp->signature) != 0x5043564e ||
-		    cp->version != 0 ||
-		    le16_to_cpu(cp->length) != ((fw->size - 7) / 4)) {
-			NV_ERROR(dev, "ctxprog invalid\n");
-			release_firmware(fw);
-			nouveau_grctx_fini(dev);
-			return -EINVAL;
-		}
-		release_firmware(fw);
-	}
-
-	if (!pgraph->ctxvals) {
-		sprintf(name, "nouveau/nv%02x.ctxvals", chipset);
-		ret = request_firmware(&fw, name, &dev->pdev->dev);
-		if (ret) {
-			NV_ERROR(dev, "No ctxvals for NV%02x\n", chipset);
-			nouveau_grctx_fini(dev);
-			return ret;
-		}
-
-		pgraph->ctxvals = kmemdup(fw->data, fw->size, GFP_KERNEL);
-		if (!pgraph->ctxvals) {
-			NV_ERROR(dev, "OOM copying ctxvals\n");
-			release_firmware(fw);
-			nouveau_grctx_fini(dev);
-			return -ENOMEM;
-		}
-
-		cv = (void *)pgraph->ctxvals;
-		if (le32_to_cpu(cv->signature) != 0x5643564e ||
-		    cv->version != 0 ||
-		    le32_to_cpu(cv->length) != ((fw->size - 9) / 8)) {
-			NV_ERROR(dev, "ctxvals invalid\n");
-			release_firmware(fw);
-			nouveau_grctx_fini(dev);
-			return -EINVAL;
-		}
-		release_firmware(fw);
-	}
-
-	cp = pgraph->ctxprog;
-
-	nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_INDEX, 0);
-	for (i = 0; i < le16_to_cpu(cp->length); i++)
-		nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_DATA,
-			le32_to_cpu(cp->data[i]));
-
-	return 0;
-}
-
-void
-nouveau_grctx_fini(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
-
-	if (pgraph->ctxprog) {
-		kfree(pgraph->ctxprog);
-		pgraph->ctxprog = NULL;
-	}
-
-	if (pgraph->ctxvals) {
-		kfree(pgraph->ctxprog);
-		pgraph->ctxvals = NULL;
-	}
-}
-
-void
-nouveau_grctx_vals_load(struct drm_device *dev, struct nouveau_gpuobj *ctx)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
-	struct nouveau_ctxvals *cv = pgraph->ctxvals;
-	int i;
-
-	if (!cv)
-		return;
-
-	for (i = 0; i < le32_to_cpu(cv->length); i++)
-		nv_wo32(dev, ctx, le32_to_cpu(cv->data[i].offset),
-			le32_to_cpu(cv->data[i].value));
-}
diff --git a/drivers/gpu/drm/nouveau/nouveau_grctx.h b/drivers/gpu/drm/nouveau/nouveau_grctx.h
index 5d39c4c..4a8ad13 100644
--- a/drivers/gpu/drm/nouveau/nouveau_grctx.h
+++ b/drivers/gpu/drm/nouveau/nouveau_grctx.h
@@ -126,7 +126,7 @@ gr_def(struct nouveau_grctx *ctx, uint32_t reg, uint32_t val)
 	reg = (reg - 0x00400000) / 4;
 	reg = (reg - ctx->ctxprog_reg) + ctx->ctxvals_base;
 
-	nv_wo32(ctx->dev, ctx->data, reg, val);
+	nv_wo32(ctx->data, reg * 4, val);
 }
 #endif
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_hw.c b/drivers/gpu/drm/nouveau/nouveau_hw.c
index 7855b35..cb13134 100644
--- a/drivers/gpu/drm/nouveau/nouveau_hw.c
+++ b/drivers/gpu/drm/nouveau/nouveau_hw.c
@@ -305,7 +305,7 @@ setPLL_double_lowregs(struct drm_device *dev, uint32_t NMNMreg,
 	bool mpll = Preg == 0x4020;
 	uint32_t oldPval = nvReadMC(dev, Preg);
 	uint32_t NMNM = pv->NM2 << 16 | pv->NM1;
-	uint32_t Pval = (oldPval & (mpll ? ~(0x11 << 16) : ~(1 << 16))) |
+	uint32_t Pval = (oldPval & (mpll ? ~(0x77 << 16) : ~(7 << 16))) |
 			0xc << 28 | pv->log2P << 16;
 	uint32_t saved4600 = 0;
 	/* some cards have different maskc040s */
@@ -865,8 +865,13 @@ nv_save_state_ext(struct drm_device *dev, int head,
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_FF_INDEX);
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_FFLWM__INDEX);
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_21);
-	if (dev_priv->card_type >= NV_30)
+
+	if (dev_priv->card_type >= NV_20)
 		rd_cio_state(dev, head, regp, NV_CIO_CRE_47);
+
+	if (dev_priv->card_type >= NV_30)
+		rd_cio_state(dev, head, regp, 0x9f);
+
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_49);
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_HCUR_ADDR0_INDEX);
 	rd_cio_state(dev, head, regp, NV_CIO_CRE_HCUR_ADDR1_INDEX);
@@ -971,9 +976,13 @@ nv_load_state_ext(struct drm_device *dev, int head,
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_ENH_INDEX);
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_FF_INDEX);
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_FFLWM__INDEX);
-	if (dev_priv->card_type >= NV_30)
+
+	if (dev_priv->card_type >= NV_20)
 		wr_cio_state(dev, head, regp, NV_CIO_CRE_47);
 
+	if (dev_priv->card_type >= NV_30)
+		wr_cio_state(dev, head, regp, 0x9f);
+
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_49);
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_HCUR_ADDR0_INDEX);
 	wr_cio_state(dev, head, regp, NV_CIO_CRE_HCUR_ADDR1_INDEX);
diff --git a/drivers/gpu/drm/nouveau/nouveau_i2c.c b/drivers/gpu/drm/nouveau/nouveau_i2c.c
index 316a3c7..8461485 100644
--- a/drivers/gpu/drm/nouveau/nouveau_i2c.c
+++ b/drivers/gpu/drm/nouveau/nouveau_i2c.c
@@ -163,7 +163,7 @@ nouveau_i2c_init(struct drm_device *dev, struct dcb_i2c_entry *entry, int index)
 	if (entry->chan)
 		return -EEXIST;
 
-	if (dev_priv->card_type == NV_50 && entry->read >= NV50_I2C_PORTS) {
+	if (dev_priv->card_type >= NV_50 && entry->read >= NV50_I2C_PORTS) {
 		NV_ERROR(dev, "unknown i2c port %d\n", entry->read);
 		return -EINVAL;
 	}
@@ -174,26 +174,26 @@ nouveau_i2c_init(struct drm_device *dev, struct dcb_i2c_entry *entry, int index)
 
 	switch (entry->port_type) {
 	case 0:
-		i2c->algo.bit.setsda = nv04_i2c_setsda;
-		i2c->algo.bit.setscl = nv04_i2c_setscl;
-		i2c->algo.bit.getsda = nv04_i2c_getsda;
-		i2c->algo.bit.getscl = nv04_i2c_getscl;
+		i2c->bit.setsda = nv04_i2c_setsda;
+		i2c->bit.setscl = nv04_i2c_setscl;
+		i2c->bit.getsda = nv04_i2c_getsda;
+		i2c->bit.getscl = nv04_i2c_getscl;
 		i2c->rd = entry->read;
 		i2c->wr = entry->write;
 		break;
 	case 4:
-		i2c->algo.bit.setsda = nv4e_i2c_setsda;
-		i2c->algo.bit.setscl = nv4e_i2c_setscl;
-		i2c->algo.bit.getsda = nv4e_i2c_getsda;
-		i2c->algo.bit.getscl = nv4e_i2c_getscl;
+		i2c->bit.setsda = nv4e_i2c_setsda;
+		i2c->bit.setscl = nv4e_i2c_setscl;
+		i2c->bit.getsda = nv4e_i2c_getsda;
+		i2c->bit.getscl = nv4e_i2c_getscl;
 		i2c->rd = 0x600800 + entry->read;
 		i2c->wr = 0x600800 + entry->write;
 		break;
 	case 5:
-		i2c->algo.bit.setsda = nv50_i2c_setsda;
-		i2c->algo.bit.setscl = nv50_i2c_setscl;
-		i2c->algo.bit.getsda = nv50_i2c_getsda;
-		i2c->algo.bit.getscl = nv50_i2c_getscl;
+		i2c->bit.setsda = nv50_i2c_setsda;
+		i2c->bit.setscl = nv50_i2c_setscl;
+		i2c->bit.getsda = nv50_i2c_getsda;
+		i2c->bit.getscl = nv50_i2c_getscl;
 		i2c->rd = nv50_i2c_port[entry->read];
 		i2c->wr = i2c->rd;
 		break;
@@ -216,17 +216,14 @@ nouveau_i2c_init(struct drm_device *dev, struct dcb_i2c_entry *entry, int index)
 	i2c_set_adapdata(&i2c->adapter, i2c);
 
 	if (entry->port_type < 6) {
-		i2c->adapter.algo_data = &i2c->algo.bit;
-		i2c->algo.bit.udelay = 40;
-		i2c->algo.bit.timeout = usecs_to_jiffies(5000);
-		i2c->algo.bit.data = i2c;
+		i2c->adapter.algo_data = &i2c->bit;
+		i2c->bit.udelay = 40;
+		i2c->bit.timeout = usecs_to_jiffies(5000);
+		i2c->bit.data = i2c;
 		ret = i2c_bit_add_bus(&i2c->adapter);
 	} else {
-		i2c->adapter.algo_data = &i2c->algo.dp;
-		i2c->algo.dp.running = false;
-		i2c->algo.dp.address = 0;
-		i2c->algo.dp.aux_ch = nouveau_dp_i2c_aux_ch;
-		ret = i2c_dp_aux_add_bus(&i2c->adapter);
+		i2c->adapter.algo = &nouveau_dp_i2c_algo;
+		ret = i2c_add_adapter(&i2c->adapter);
 	}
 
 	if (ret) {
@@ -278,3 +275,45 @@ nouveau_i2c_find(struct drm_device *dev, int index)
 	return i2c->chan;
 }
 
+bool
+nouveau_probe_i2c_addr(struct nouveau_i2c_chan *i2c, int addr)
+{
+	uint8_t buf[] = { 0 };
+	struct i2c_msg msgs[] = {
+		{
+			.addr = addr,
+			.flags = 0,
+			.len = 1,
+			.buf = buf,
+		},
+		{
+			.addr = addr,
+			.flags = I2C_M_RD,
+			.len = 1,
+			.buf = buf,
+		}
+	};
+
+	return i2c_transfer(&i2c->adapter, msgs, 2) == 2;
+}
+
+int
+nouveau_i2c_identify(struct drm_device *dev, const char *what,
+		     struct i2c_board_info *info, int index)
+{
+	struct nouveau_i2c_chan *i2c = nouveau_i2c_find(dev, index);
+	int i;
+
+	NV_DEBUG(dev, "Probing %ss on I2C bus: %d\n", what, index);
+
+	for (i = 0; info[i].addr; i++) {
+		if (nouveau_probe_i2c_addr(i2c, info[i].addr)) {
+			NV_INFO(dev, "Detected %s: %s\n", what, info[i].type);
+			return i;
+		}
+	}
+
+	NV_DEBUG(dev, "No devices found.\n");
+
+	return -ENODEV;
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_i2c.h b/drivers/gpu/drm/nouveau/nouveau_i2c.h
index c8eaf7a..f71cb32 100644
--- a/drivers/gpu/drm/nouveau/nouveau_i2c.h
+++ b/drivers/gpu/drm/nouveau/nouveau_i2c.h
@@ -33,10 +33,7 @@ struct dcb_i2c_entry;
 struct nouveau_i2c_chan {
 	struct i2c_adapter adapter;
 	struct drm_device *dev;
-	union {
-		struct i2c_algo_bit_data bit;
-		struct i2c_algo_dp_aux_data dp;
-	} algo;
+	struct i2c_algo_bit_data bit;
 	unsigned rd;
 	unsigned wr;
 	unsigned data;
@@ -45,8 +42,10 @@ struct nouveau_i2c_chan {
 int nouveau_i2c_init(struct drm_device *, struct dcb_i2c_entry *, int index);
 void nouveau_i2c_fini(struct drm_device *, struct dcb_i2c_entry *);
 struct nouveau_i2c_chan *nouveau_i2c_find(struct drm_device *, int index);
+bool nouveau_probe_i2c_addr(struct nouveau_i2c_chan *i2c, int addr);
+int nouveau_i2c_identify(struct drm_device *dev, const char *what,
+			 struct i2c_board_info *info, int index);
 
-int nouveau_dp_i2c_aux_ch(struct i2c_adapter *, int mode, uint8_t write_byte,
-			  uint8_t *read_byte);
+extern const struct i2c_algorithm nouveau_dp_i2c_algo;
 
 #endif /* __NOUVEAU_I2C_H__ */
diff --git a/drivers/gpu/drm/nouveau/nouveau_irq.c b/drivers/gpu/drm/nouveau/nouveau_irq.c
index 53360f1..6fd51a5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_irq.c
+++ b/drivers/gpu/drm/nouveau/nouveau_irq.c
@@ -35,6 +35,7 @@
 #include "nouveau_drm.h"
 #include "nouveau_drv.h"
 #include "nouveau_reg.h"
+#include "nouveau_ramht.h"
 #include <linux/ratelimit.h>
 
 /* needed for hotplug irq */
@@ -49,7 +50,7 @@ nouveau_irq_preinstall(struct drm_device *dev)
 	/* Master disable */
 	nv_wr32(dev, NV03_PMC_INTR_EN_0, 0);
 
-	if (dev_priv->card_type == NV_50) {
+	if (dev_priv->card_type >= NV_50) {
 		INIT_WORK(&dev_priv->irq_work, nv50_display_irq_handler_bh);
 		INIT_WORK(&dev_priv->hpd_work, nv50_display_irq_hotplug_bh);
 		INIT_LIST_HEAD(&dev_priv->vbl_waiting);
@@ -106,15 +107,16 @@ nouveau_fifo_swmthd(struct nouveau_channel *chan, uint32_t addr, uint32_t data)
 	const int mthd = addr & 0x1ffc;
 
 	if (mthd == 0x0000) {
-		struct nouveau_gpuobj_ref *ref = NULL;
+		struct nouveau_gpuobj *gpuobj;
 
-		if (nouveau_gpuobj_ref_find(chan, data, &ref))
+		gpuobj = nouveau_ramht_find(chan, data);
+		if (!gpuobj)
 			return false;
 
-		if (ref->gpuobj->engine != NVOBJ_ENGINE_SW)
+		if (gpuobj->engine != NVOBJ_ENGINE_SW)
 			return false;
 
-		chan->sw_subchannel[subc] = ref->gpuobj->class;
+		chan->sw_subchannel[subc] = gpuobj->class;
 		nv_wr32(dev, NV04_PFIFO_CACHE1_ENGINE, nv_rd32(dev,
 			NV04_PFIFO_CACHE1_ENGINE) & ~(0xf << subc * 4));
 		return true;
@@ -200,16 +202,45 @@ nouveau_fifo_irq_handler(struct drm_device *dev)
 		}
 
 		if (status & NV_PFIFO_INTR_DMA_PUSHER) {
-			NV_INFO(dev, "PFIFO_DMA_PUSHER - Ch %d\n", chid);
+			u32 get = nv_rd32(dev, 0x003244);
+			u32 put = nv_rd32(dev, 0x003240);
+			u32 push = nv_rd32(dev, 0x003220);
+			u32 state = nv_rd32(dev, 0x003228);
+
+			if (dev_priv->card_type == NV_50) {
+				u32 ho_get = nv_rd32(dev, 0x003328);
+				u32 ho_put = nv_rd32(dev, 0x003320);
+				u32 ib_get = nv_rd32(dev, 0x003334);
+				u32 ib_put = nv_rd32(dev, 0x003330);
+
+				NV_INFO(dev, "PFIFO_DMA_PUSHER - Ch %d Get 0x%02x%08x "
+					     "Put 0x%02x%08x IbGet 0x%08x IbPut 0x%08x "
+					     "State 0x%08x Push 0x%08x\n",
+					chid, ho_get, get, ho_put, put, ib_get, ib_put,
+					state, push);
+
+				/* METHOD_COUNT, in DMA_STATE on earlier chipsets */
+				nv_wr32(dev, 0x003364, 0x00000000);
+				if (get != put || ho_get != ho_put) {
+					nv_wr32(dev, 0x003244, put);
+					nv_wr32(dev, 0x003328, ho_put);
+				} else
+				if (ib_get != ib_put) {
+					nv_wr32(dev, 0x003334, ib_put);
+				}
+			} else {
+				NV_INFO(dev, "PFIFO_DMA_PUSHER - Ch %d Get 0x%08x "
+					     "Put 0x%08x State 0x%08x Push 0x%08x\n",
+					chid, get, put, state, push);
 
-			status &= ~NV_PFIFO_INTR_DMA_PUSHER;
-			nv_wr32(dev, NV03_PFIFO_INTR_0,
-						NV_PFIFO_INTR_DMA_PUSHER);
+				if (get != put)
+					nv_wr32(dev, 0x003244, put);
+			}
 
-			nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_STATE, 0x00000000);
-			if (nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_PUT) != get)
-				nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_GET,
-								get + 4);
+			nv_wr32(dev, 0x003228, 0x00000000);
+			nv_wr32(dev, 0x003220, 0x00000001);
+			nv_wr32(dev, 0x002100, NV_PFIFO_INTR_DMA_PUSHER);
+			status &= ~NV_PFIFO_INTR_DMA_PUSHER;
 		}
 
 		if (status & NV_PFIFO_INTR_SEMAPHORE) {
@@ -226,6 +257,14 @@ nouveau_fifo_irq_handler(struct drm_device *dev)
 			nv_wr32(dev, NV04_PFIFO_CACHE1_PULL0, 1);
 		}
 
+		if (dev_priv->card_type == NV_50) {
+			if (status & 0x00000010) {
+				nv50_fb_vm_trap(dev, 1, "PFIFO_BAR_FAULT");
+				status &= ~0x00000010;
+				nv_wr32(dev, 0x002100, 0x00000010);
+			}
+		}
+
 		if (status) {
 			NV_INFO(dev, "PFIFO_INTR 0x%08x - Ch %d\n",
 				status, chid);
@@ -357,7 +396,7 @@ nouveau_graph_chid_from_grctx(struct drm_device *dev)
 			if (!chan || !chan->ramin_grctx)
 				continue;
 
-			if (inst == chan->ramin_grctx->instance)
+			if (inst == chan->ramin_grctx->pinst)
 				break;
 		}
 	} else {
@@ -369,7 +408,7 @@ nouveau_graph_chid_from_grctx(struct drm_device *dev)
 			if (!chan || !chan->ramin)
 				continue;
 
-			if (inst == chan->ramin->instance)
+			if (inst == chan->ramin->vinst)
 				break;
 		}
 	}
@@ -586,11 +625,11 @@ nouveau_pgraph_irq_handler(struct drm_device *dev)
 		}
 
 		if (status & NV_PGRAPH_INTR_CONTEXT_SWITCH) {
-			nouveau_pgraph_intr_context_switch(dev);
-
 			status &= ~NV_PGRAPH_INTR_CONTEXT_SWITCH;
 			nv_wr32(dev, NV03_PGRAPH_INTR,
 				 NV_PGRAPH_INTR_CONTEXT_SWITCH);
+
+			nouveau_pgraph_intr_context_switch(dev);
 		}
 
 		if (status) {
@@ -605,40 +644,6 @@ nouveau_pgraph_irq_handler(struct drm_device *dev)
 	nv_wr32(dev, NV03_PMC_INTR_0, NV_PMC_INTR_0_PGRAPH_PENDING);
 }
 
-static void
-nv50_pfb_vm_trap(struct drm_device *dev, int display, const char *name)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t trap[6];
-	int i, ch;
-	uint32_t idx = nv_rd32(dev, 0x100c90);
-	if (idx & 0x80000000) {
-		idx &= 0xffffff;
-		if (display) {
-			for (i = 0; i < 6; i++) {
-				nv_wr32(dev, 0x100c90, idx | i << 24);
-				trap[i] = nv_rd32(dev, 0x100c94);
-			}
-			for (ch = 0; ch < dev_priv->engine.fifo.channels; ch++) {
-				struct nouveau_channel *chan = dev_priv->fifos[ch];
-
-				if (!chan || !chan->ramin)
-					continue;
-
-				if (trap[1] == chan->ramin->instance >> 12)
-					break;
-			}
-			NV_INFO(dev, "%s - VM: Trapped %s at %02x%04x%04x status %08x %08x channel %d\n",
-					name, (trap[5]&0x100?"read":"write"),
-					trap[5]&0xff, trap[4]&0xffff,
-					trap[3]&0xffff, trap[0], trap[2], ch);
-		}
-		nv_wr32(dev, 0x100c90, idx | 0x80000000);
-	} else if (display) {
-		NV_INFO(dev, "%s - no VM fault?\n", name);
-	}
-}
-
 static struct nouveau_enum_names nv50_mp_exec_error_names[] =
 {
 	{ 3, "STACK_UNDERFLOW" },
@@ -711,7 +716,7 @@ nv50_pgraph_tp_trap(struct drm_device *dev, int type, uint32_t ustatus_old,
 		tps++;
 		switch (type) {
 		case 6: /* texture error... unknown for now */
-			nv50_pfb_vm_trap(dev, display, name);
+			nv50_fb_vm_trap(dev, display, name);
 			if (display) {
 				NV_ERROR(dev, "magic set %d:\n", i);
 				for (r = ustatus_addr + 4; r <= ustatus_addr + 0x10; r += 4)
@@ -734,7 +739,7 @@ nv50_pgraph_tp_trap(struct drm_device *dev, int type, uint32_t ustatus_old,
 			uint32_t e1c = nv_rd32(dev, ustatus_addr + 0x14);
 			uint32_t e20 = nv_rd32(dev, ustatus_addr + 0x18);
 			uint32_t e24 = nv_rd32(dev, ustatus_addr + 0x1c);
-			nv50_pfb_vm_trap(dev, display, name);
+			nv50_fb_vm_trap(dev, display, name);
 			/* 2d engine destination */
 			if (ustatus & 0x00000010) {
 				if (display) {
@@ -817,7 +822,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 
 		/* Known to be triggered by screwed up NOTIFY and COND... */
 		if (ustatus & 0x00000001) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_DISPATCH_FAULT");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_DISPATCH_FAULT");
 			nv_wr32(dev, 0x400500, 0);
 			if (nv_rd32(dev, 0x400808) & 0x80000000) {
 				if (display) {
@@ -842,7 +847,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 			ustatus &= ~0x00000001;
 		}
 		if (ustatus & 0x00000002) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_DISPATCH_QUERY");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_DISPATCH_QUERY");
 			nv_wr32(dev, 0x400500, 0);
 			if (nv_rd32(dev, 0x40084c) & 0x80000000) {
 				if (display) {
@@ -884,15 +889,15 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 			NV_INFO(dev, "PGRAPH_TRAP_M2MF - no ustatus?\n");
 		}
 		if (ustatus & 0x00000001) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_NOTIFY");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_NOTIFY");
 			ustatus &= ~0x00000001;
 		}
 		if (ustatus & 0x00000002) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_IN");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_IN");
 			ustatus &= ~0x00000002;
 		}
 		if (ustatus & 0x00000004) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_OUT");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_M2MF_OUT");
 			ustatus &= ~0x00000004;
 		}
 		NV_INFO (dev, "PGRAPH_TRAP_M2MF - %08x %08x %08x %08x\n",
@@ -917,7 +922,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 			NV_INFO(dev, "PGRAPH_TRAP_VFETCH - no ustatus?\n");
 		}
 		if (ustatus & 0x00000001) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_VFETCH_FAULT");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_VFETCH_FAULT");
 			NV_INFO (dev, "PGRAPH_TRAP_VFETCH_FAULT - %08x %08x %08x %08x\n",
 					nv_rd32(dev, 0x400c00),
 					nv_rd32(dev, 0x400c08),
@@ -939,7 +944,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 			NV_INFO(dev, "PGRAPH_TRAP_STRMOUT - no ustatus?\n");
 		}
 		if (ustatus & 0x00000001) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_STRMOUT_FAULT");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_STRMOUT_FAULT");
 			NV_INFO (dev, "PGRAPH_TRAP_STRMOUT_FAULT - %08x %08x %08x %08x\n",
 					nv_rd32(dev, 0x401804),
 					nv_rd32(dev, 0x401808),
@@ -964,7 +969,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 			NV_INFO(dev, "PGRAPH_TRAP_CCACHE - no ustatus?\n");
 		}
 		if (ustatus & 0x00000001) {
-			nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_CCACHE_FAULT");
+			nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_CCACHE_FAULT");
 			NV_INFO (dev, "PGRAPH_TRAP_CCACHE_FAULT - %08x %08x %08x %08x %08x %08x %08x\n",
 					nv_rd32(dev, 0x405800),
 					nv_rd32(dev, 0x405804),
@@ -986,7 +991,7 @@ nv50_pgraph_trap_handler(struct drm_device *dev)
 	 * remaining, so try to handle it anyway. Perhaps related to that
 	 * unknown DMA slot on tesla? */
 	if (status & 0x20) {
-		nv50_pfb_vm_trap(dev, display, "PGRAPH_TRAP_UNKC04");
+		nv50_fb_vm_trap(dev, display, "PGRAPH_TRAP_UNKC04");
 		ustatus = nv_rd32(dev, 0x402000) & 0x7fffffff;
 		if (display)
 			NV_INFO(dev, "PGRAPH_TRAP_UNKC04 - Unhandled ustatus 0x%08x\n", ustatus);
diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c b/drivers/gpu/drm/nouveau/nouveau_mem.c
index c1fd42b..4f0ae39 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.c
@@ -35,162 +35,6 @@
 #include "drm_sarea.h"
 #include "nouveau_drv.h"
 
-static struct mem_block *
-split_block(struct mem_block *p, uint64_t start, uint64_t size,
-	    struct drm_file *file_priv)
-{
-	/* Maybe cut off the start of an existing block */
-	if (start > p->start) {
-		struct mem_block *newblock =
-			kmalloc(sizeof(*newblock), GFP_KERNEL);
-		if (!newblock)
-			goto out;
-		newblock->start = start;
-		newblock->size = p->size - (start - p->start);
-		newblock->file_priv = NULL;
-		newblock->next = p->next;
-		newblock->prev = p;
-		p->next->prev = newblock;
-		p->next = newblock;
-		p->size -= newblock->size;
-		p = newblock;
-	}
-
-	/* Maybe cut off the end of an existing block */
-	if (size < p->size) {
-		struct mem_block *newblock =
-			kmalloc(sizeof(*newblock), GFP_KERNEL);
-		if (!newblock)
-			goto out;
-		newblock->start = start + size;
-		newblock->size = p->size - size;
-		newblock->file_priv = NULL;
-		newblock->next = p->next;
-		newblock->prev = p;
-		p->next->prev = newblock;
-		p->next = newblock;
-		p->size = size;
-	}
-
-out:
-	/* Our block is in the middle */
-	p->file_priv = file_priv;
-	return p;
-}
-
-struct mem_block *
-nouveau_mem_alloc_block(struct mem_block *heap, uint64_t size,
-			int align2, struct drm_file *file_priv, int tail)
-{
-	struct mem_block *p;
-	uint64_t mask = (1 << align2) - 1;
-
-	if (!heap)
-		return NULL;
-
-	if (tail) {
-		list_for_each_prev(p, heap) {
-			uint64_t start = ((p->start + p->size) - size) & ~mask;
-
-			if (p->file_priv == NULL && start >= p->start &&
-			    start + size <= p->start + p->size)
-				return split_block(p, start, size, file_priv);
-		}
-	} else {
-		list_for_each(p, heap) {
-			uint64_t start = (p->start + mask) & ~mask;
-
-			if (p->file_priv == NULL &&
-			    start + size <= p->start + p->size)
-				return split_block(p, start, size, file_priv);
-		}
-	}
-
-	return NULL;
-}
-
-void nouveau_mem_free_block(struct mem_block *p)
-{
-	p->file_priv = NULL;
-
-	/* Assumes a single contiguous range.  Needs a special file_priv in
-	 * 'heap' to stop it being subsumed.
-	 */
-	if (p->next->file_priv == NULL) {
-		struct mem_block *q = p->next;
-		p->size += q->size;
-		p->next = q->next;
-		p->next->prev = p;
-		kfree(q);
-	}
-
-	if (p->prev->file_priv == NULL) {
-		struct mem_block *q = p->prev;
-		q->size += p->size;
-		q->next = p->next;
-		q->next->prev = q;
-		kfree(p);
-	}
-}
-
-/* Initialize.  How to check for an uninitialized heap?
- */
-int nouveau_mem_init_heap(struct mem_block **heap, uint64_t start,
-			  uint64_t size)
-{
-	struct mem_block *blocks = kmalloc(sizeof(*blocks), GFP_KERNEL);
-
-	if (!blocks)
-		return -ENOMEM;
-
-	*heap = kmalloc(sizeof(**heap), GFP_KERNEL);
-	if (!*heap) {
-		kfree(blocks);
-		return -ENOMEM;
-	}
-
-	blocks->start = start;
-	blocks->size = size;
-	blocks->file_priv = NULL;
-	blocks->next = blocks->prev = *heap;
-
-	memset(*heap, 0, sizeof(**heap));
-	(*heap)->file_priv = (struct drm_file *) -1;
-	(*heap)->next = (*heap)->prev = blocks;
-	return 0;
-}
-
-/*
- * Free all blocks associated with the releasing file_priv
- */
-void nouveau_mem_release(struct drm_file *file_priv, struct mem_block *heap)
-{
-	struct mem_block *p;
-
-	if (!heap || !heap->next)
-		return;
-
-	list_for_each(p, heap) {
-		if (p->file_priv == file_priv)
-			p->file_priv = NULL;
-	}
-
-	/* Assumes a single contiguous range.  Needs a special file_priv in
-	 * 'heap' to stop it being subsumed.
-	 */
-	list_for_each(p, heap) {
-		while ((p->file_priv == NULL) &&
-					(p->next->file_priv == NULL) &&
-					(p->next != heap)) {
-			struct mem_block *q = p->next;
-			p->size += q->size;
-			p->next = q->next;
-			p->next->prev = p;
-			kfree(q);
-		}
-	}
-}
-
 /*
  * NV10-NV40 tiling helpers
  */
@@ -203,18 +47,14 @@ nv10_mem_set_region_tiling(struct drm_device *dev, int i, uint32_t addr,
 	struct nouveau_fifo_engine *pfifo = &dev_priv->engine.fifo;
 	struct nouveau_fb_engine *pfb = &dev_priv->engine.fb;
 	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
-	struct nouveau_tile_reg *tile = &dev_priv->tile.reg[i];
+	struct nouveau_tile_reg *tile = &dev_priv->tile[i];
 
 	tile->addr = addr;
 	tile->size = size;
 	tile->used = !!pitch;
 	nouveau_fence_unref((void **)&tile->fence);
 
-	if (!pfifo->cache_flush(dev))
-		return;
-
 	pfifo->reassign(dev, false);
-	pfifo->cache_flush(dev);
 	pfifo->cache_pull(dev, false);
 
 	nouveau_wait_for_idle(dev);
@@ -232,34 +72,36 @@ nv10_mem_set_tiling(struct drm_device *dev, uint32_t addr, uint32_t size,
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_fb_engine *pfb = &dev_priv->engine.fb;
-	struct nouveau_tile_reg *tile = dev_priv->tile.reg, *found = NULL;
-	int i;
+	struct nouveau_tile_reg *found = NULL;
+	unsigned long i, flags;
 
-	spin_lock(&dev_priv->tile.lock);
+	spin_lock_irqsave(&dev_priv->context_switch_lock, flags);
 
 	for (i = 0; i < pfb->num_tiles; i++) {
-		if (tile[i].used)
+		struct nouveau_tile_reg *tile = &dev_priv->tile[i];
+
+		if (tile->used)
 			/* Tile region in use. */
 			continue;
 
-		if (tile[i].fence &&
-		    !nouveau_fence_signalled(tile[i].fence, NULL))
+		if (tile->fence &&
+		    !nouveau_fence_signalled(tile->fence, NULL))
 			/* Pending tile region. */
 			continue;
 
-		if (max(tile[i].addr, addr) <
-		    min(tile[i].addr + tile[i].size, addr + size))
+		if (max(tile->addr, addr) <
+		    min(tile->addr + tile->size, addr + size))
 			/* Kill an intersecting tile region. */
 			nv10_mem_set_region_tiling(dev, i, 0, 0, 0);
 
 		if (pitch && !found) {
 			/* Free tile region. */
 			nv10_mem_set_region_tiling(dev, i, addr, size, pitch);
-			found = &tile[i];
+			found = tile;
 		}
 	}
 
-	spin_unlock(&dev_priv->tile.lock);
+	spin_unlock_irqrestore(&dev_priv->context_switch_lock, flags);
 
 	return found;
 }
@@ -299,7 +141,6 @@ nv50_mem_vm_bind_linear(struct drm_device *dev, uint64_t virt, uint32_t size,
 		phys |= 0x30;
 	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	while (size) {
 		unsigned offset_h = upper_32_bits(phys);
 		unsigned offset_l = lower_32_bits(phys);
@@ -326,41 +167,18 @@ nv50_mem_vm_bind_linear(struct drm_device *dev, uint64_t virt, uint32_t size,
 			virt  += (end - pte);
 
 			while (pte < end) {
-				nv_wo32(dev, pgt, pte++, offset_l);
-				nv_wo32(dev, pgt, pte++, offset_h);
+				nv_wo32(pgt, (pte * 4) + 0, offset_l);
+				nv_wo32(pgt, (pte * 4) + 4, offset_h);
+				pte += 2;
 			}
 		}
 	}
-	dev_priv->engine.instmem.finish_access(dev);
-
-	nv_wr32(dev, 0x100c80, 0x00050001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
-	}
-
-	nv_wr32(dev, 0x100c80, 0x00000001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
-	}
-
-	nv_wr32(dev, 0x100c80, 0x00040001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
-	}
-
-	nv_wr32(dev, 0x100c80, 0x00060001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
-	}
+	dev_priv->engine.instmem.flush(dev);
 
+	nv50_vm_flush(dev, 5);
+	nv50_vm_flush(dev, 0);
+	nv50_vm_flush(dev, 4);
+	nv50_vm_flush(dev, 6);
 	return 0;
 }
 
@@ -374,7 +192,6 @@ nv50_mem_vm_unbind(struct drm_device *dev, uint64_t virt, uint32_t size)
 	virt -= dev_priv->vm_vram_base;
 	pages = (size >> 16) << 1;
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	while (pages) {
 		pgt = dev_priv->vm_vram_pt[virt >> 29];
 		pte = (virt & 0x1ffe0000ULL) >> 15;
@@ -385,60 +202,24 @@ nv50_mem_vm_unbind(struct drm_device *dev, uint64_t virt, uint32_t size)
 		pages -= (end - pte);
 		virt  += (end - pte) << 15;
 
-		while (pte < end)
-			nv_wo32(dev, pgt, pte++, 0);
-	}
-	dev_priv->engine.instmem.finish_access(dev);
-
-	nv_wr32(dev, 0x100c80, 0x00050001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return;
-	}
-
-	nv_wr32(dev, 0x100c80, 0x00000001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return;
-	}
-
-	nv_wr32(dev, 0x100c80, 0x00040001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return;
+		while (pte < end) {
+			nv_wo32(pgt, (pte * 4), 0);
+			pte++;
+		}
 	}
+	dev_priv->engine.instmem.flush(dev);
 
-	nv_wr32(dev, 0x100c80, 0x00060001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-	}
+	nv50_vm_flush(dev, 5);
+	nv50_vm_flush(dev, 0);
+	nv50_vm_flush(dev, 4);
+	nv50_vm_flush(dev, 6);
 }
 
 /*
  * Cleanup everything
  */
-void nouveau_mem_takedown(struct mem_block **heap)
-{
-	struct mem_block *p;
-
-	if (!*heap)
-		return;
-
-	for (p = (*heap)->next; p != *heap;) {
-		struct mem_block *q = p;
-		p = p->next;
-		kfree(q);
-	}
-
-	kfree(*heap);
-	*heap = NULL;
-}
-
-void nouveau_mem_close(struct drm_device *dev)
+void
+nouveau_mem_vram_fini(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
@@ -449,8 +230,20 @@ void nouveau_mem_close(struct drm_device *dev)
 
 	nouveau_ttm_global_release(dev_priv);
 
-	if (drm_core_has_AGP(dev) && dev->agp &&
-	    drm_core_check_feature(dev, DRIVER_MODESET)) {
+	if (dev_priv->fb_mtrr >= 0) {
+		drm_mtrr_del(dev_priv->fb_mtrr,
+			     pci_resource_start(dev->pdev, 1),
+			     pci_resource_len(dev->pdev, 1), DRM_MTRR_WC);
+		dev_priv->fb_mtrr = -1;
+	}
+}
+
+void
+nouveau_mem_gart_fini(struct drm_device *dev)
+{
+	nouveau_sgdma_takedown(dev);
+
+	if (drm_core_has_AGP(dev) && dev->agp) {
 		struct drm_agp_mem *entry, *tempe;
 
 		/* Remove AGP resources, but leave dev->agp
@@ -469,30 +262,24 @@ void nouveau_mem_close(struct drm_device *dev)
 		dev->agp->acquired = 0;
 		dev->agp->enabled = 0;
 	}
-
-	if (dev_priv->fb_mtrr) {
-		drm_mtrr_del(dev_priv->fb_mtrr, drm_get_resource_start(dev, 1),
-			     drm_get_resource_len(dev, 1), DRM_MTRR_WC);
-		dev_priv->fb_mtrr = 0;
-	}
 }
 
 static uint32_t
 nouveau_mem_detect_nv04(struct drm_device *dev)
 {
-	uint32_t boot0 = nv_rd32(dev, NV03_BOOT_0);
+	uint32_t boot0 = nv_rd32(dev, NV04_PFB_BOOT_0);
 
 	if (boot0 & 0x00000100)
 		return (((boot0 >> 12) & 0xf) * 2 + 2) * 1024 * 1024;
 
-	switch (boot0 & NV03_BOOT_0_RAM_AMOUNT) {
-	case NV04_BOOT_0_RAM_AMOUNT_32MB:
+	switch (boot0 & NV04_PFB_BOOT_0_RAM_AMOUNT) {
+	case NV04_PFB_BOOT_0_RAM_AMOUNT_32MB:
 		return 32 * 1024 * 1024;
-	case NV04_BOOT_0_RAM_AMOUNT_16MB:
+	case NV04_PFB_BOOT_0_RAM_AMOUNT_16MB:
 		return 16 * 1024 * 1024;
-	case NV04_BOOT_0_RAM_AMOUNT_8MB:
+	case NV04_PFB_BOOT_0_RAM_AMOUNT_8MB:
 		return 8 * 1024 * 1024;
-	case NV04_BOOT_0_RAM_AMOUNT_4MB:
+	case NV04_PFB_BOOT_0_RAM_AMOUNT_4MB:
 		return 4 * 1024 * 1024;
 	}
 
@@ -525,8 +312,62 @@ nouveau_mem_detect_nforce(struct drm_device *dev)
 	return 0;
 }
 
-/* returns the amount of FB ram in bytes */
-int
+static void
+nv50_vram_preinit(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	int i, parts, colbits, rowbitsa, rowbitsb, banks;
+	u64 rowsize, predicted;
+	u32 r0, r4, rt, ru;
+
+	r0 = nv_rd32(dev, 0x100200);
+	r4 = nv_rd32(dev, 0x100204);
+	rt = nv_rd32(dev, 0x100250);
+	ru = nv_rd32(dev, 0x001540);
+	NV_DEBUG(dev, "memcfg 0x%08x 0x%08x 0x%08x 0x%08x\n", r0, r4, rt, ru);
+
+	for (i = 0, parts = 0; i < 8; i++) {
+		if (ru & (0x00010000 << i))
+			parts++;
+	}
+
+	colbits  =  (r4 & 0x0000f000) >> 12;
+	rowbitsa = ((r4 & 0x000f0000) >> 16) + 8;
+	rowbitsb = ((r4 & 0x00f00000) >> 20) + 8;
+	banks    = ((r4 & 0x01000000) ? 8 : 4);
+
+	rowsize = parts * banks * (1 << colbits) * 8;
+	predicted = rowsize << rowbitsa;
+	if (r0 & 0x00000004)
+		predicted += rowsize << rowbitsb;
+
+	if (predicted != dev_priv->vram_size) {
+		NV_WARN(dev, "memory controller reports %dMiB VRAM\n",
+			(u32)(dev_priv->vram_size >> 20));
+		NV_WARN(dev, "we calculated %dMiB VRAM\n",
+			(u32)(predicted >> 20));
+	}
+
+	dev_priv->vram_rblock_size = rowsize >> 12;
+	if (rt & 1)
+		dev_priv->vram_rblock_size *= 3;
+
+	NV_DEBUG(dev, "rblock %lld bytes\n",
+		 (u64)dev_priv->vram_rblock_size << 12);
+}
+
+static void
+nvaa_vram_preinit(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	/* To our knowledge, there's no large scale reordering of pages
+	 * that occurs on IGP chipsets.
+	 */
+	dev_priv->vram_rblock_size = 1;
+}
+
+static int
 nouveau_mem_detect(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
@@ -536,12 +377,31 @@ nouveau_mem_detect(struct drm_device *dev)
 	} else
 	if (dev_priv->flags & (NV_NFORCE | NV_NFORCE2)) {
 		dev_priv->vram_size = nouveau_mem_detect_nforce(dev);
-	} else {
-		dev_priv->vram_size  = nv_rd32(dev, NV04_FIFO_DATA);
-		dev_priv->vram_size &= NV10_FIFO_DATA_RAM_AMOUNT_MB_MASK;
-		if (dev_priv->chipset == 0xaa || dev_priv->chipset == 0xac)
+	} else
+	if (dev_priv->card_type < NV_50) {
+		dev_priv->vram_size  = nv_rd32(dev, NV04_PFB_FIFO_DATA);
+		dev_priv->vram_size &= NV10_PFB_FIFO_DATA_RAM_AMOUNT_MB_MASK;
+	} else
+	if (dev_priv->card_type < NV_C0) {
+		dev_priv->vram_size = nv_rd32(dev, NV04_PFB_FIFO_DATA);
+		dev_priv->vram_size |= (dev_priv->vram_size & 0xff) << 32;
+		dev_priv->vram_size &= 0xffffffff00ll;
+
+		switch (dev_priv->chipset) {
+		case 0xaa:
+		case 0xac:
+		case 0xaf:
 			dev_priv->vram_sys_base = nv_rd32(dev, 0x100e10);
 			dev_priv->vram_sys_base <<= 12;
+			nvaa_vram_preinit(dev);
+			break;
+		default:
+			nv50_vram_preinit(dev);
+			break;
+		}
+	} else {
+		dev_priv->vram_size  = nv_rd32(dev, 0x10f20c) << 20;
+		dev_priv->vram_size *= nv_rd32(dev, 0x121c74);
 	}
 
 	NV_INFO(dev, "Detected %dMiB VRAM\n", (int)(dev_priv->vram_size >> 20));
@@ -556,17 +416,63 @@ nouveau_mem_detect(struct drm_device *dev)
 }
 
 #if __OS_HAS_AGP
-static void nouveau_mem_reset_agp(struct drm_device *dev)
+static unsigned long
+get_agp_mode(struct drm_device *dev, unsigned long mode)
 {
-	uint32_t saved_pci_nv_1, saved_pci_nv_19, pmc_enable;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	/*
+	 * FW seems to be broken on nv18; it makes the card lock up
+	 * randomly.
+	 */
+	if (dev_priv->chipset == 0x18)
+		mode &= ~PCI_AGP_COMMAND_FW;
+
+	/*
+	 * AGP mode set in the command line.
+	 */
+	if (nouveau_agpmode > 0) {
+		bool agpv3 = mode & 0x8;
+		int rate = agpv3 ? nouveau_agpmode / 4 : nouveau_agpmode;
+
+		mode = (mode & ~0x7) | (rate & 0x7);
+	}
+
+	return mode;
+}
+#endif
+
+int
+nouveau_mem_reset_agp(struct drm_device *dev)
+{
+#if __OS_HAS_AGP
+	uint32_t saved_pci_nv_1, pmc_enable;
+	int ret;
+
+	/* First of all, disable fast writes; otherwise, if they are
+	 * already enabled in the AGP bridge and we disable the card's
+	 * AGP controller, we might lock ourselves out of it. */
+	if ((nv_rd32(dev, NV04_PBUS_PCI_NV_19) |
+	     dev->agp->mode) & PCI_AGP_COMMAND_FW) {
+		struct drm_agp_info info;
+		struct drm_agp_mode mode;
+
+		ret = drm_agp_info(dev, &info);
+		if (ret)
+			return ret;
+
+		mode.mode = get_agp_mode(dev, info.mode) & ~PCI_AGP_COMMAND_FW;
+		ret = drm_agp_enable(dev, mode);
+		if (ret)
+			return ret;
+	}
 
 	saved_pci_nv_1 = nv_rd32(dev, NV04_PBUS_PCI_NV_1);
-	saved_pci_nv_19 = nv_rd32(dev, NV04_PBUS_PCI_NV_19);
 
 	/* clear busmaster bit */
 	nv_wr32(dev, NV04_PBUS_PCI_NV_1, saved_pci_nv_1 & ~0x4);
-	/* clear SBA and AGP bits */
-	nv_wr32(dev, NV04_PBUS_PCI_NV_19, saved_pci_nv_19 & 0xfffff0ff);
+	/* disable AGP */
+	nv_wr32(dev, NV04_PBUS_PCI_NV_19, 0);
 
 	/* power cycle pgraph, if enabled */
 	pmc_enable = nv_rd32(dev, NV03_PMC_ENABLE);
@@ -578,11 +484,12 @@ static void nouveau_mem_reset_agp(struct drm_device *dev)
 	}
 
 	/* and restore (gives effect of resetting AGP) */
-	nv_wr32(dev, NV04_PBUS_PCI_NV_19, saved_pci_nv_19);
 	nv_wr32(dev, NV04_PBUS_PCI_NV_1, saved_pci_nv_1);
-}
 #endif
 
+	return 0;
+}
+
 int
 nouveau_mem_init_agp(struct drm_device *dev)
 {
@@ -592,11 +499,6 @@ nouveau_mem_init_agp(struct drm_device *dev)
 	struct drm_agp_mode mode;
 	int ret;
 
-	if (nouveau_noagp)
-		return 0;
-
-	nouveau_mem_reset_agp(dev);
-
 	if (!dev->agp->acquired) {
 		ret = drm_agp_acquire(dev);
 		if (ret) {
@@ -605,6 +507,8 @@ nouveau_mem_init_agp(struct drm_device *dev)
 		}
 	}
 
+	nouveau_mem_reset_agp(dev);
+
 	ret = drm_agp_info(dev, &info);
 	if (ret) {
 		NV_ERROR(dev, "Unable to get AGP info: %d\n", ret);
@@ -612,7 +516,7 @@ nouveau_mem_init_agp(struct drm_device *dev)
 	}
 
 	/* see agp.h for the AGPSTAT_* modes available */
-	mode.mode = info.mode;
+	mode.mode = get_agp_mode(dev, info.mode);
 	ret = drm_agp_enable(dev, mode);
 	if (ret) {
 		NV_ERROR(dev, "Unable to enable AGP: %d\n", ret);
@@ -627,24 +531,27 @@ nouveau_mem_init_agp(struct drm_device *dev)
 }
 
 int
-nouveau_mem_init(struct drm_device *dev)
+nouveau_mem_vram_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct ttm_bo_device *bdev = &dev_priv->ttm.bdev;
-	int ret, dma_bits = 32;
-
-	dev_priv->fb_phys = drm_get_resource_start(dev, 1);
-	dev_priv->gart_info.type = NOUVEAU_GART_NONE;
+	int ret, dma_bits;
 
 	if (dev_priv->card_type >= NV_50 &&
 	    pci_dma_supported(dev->pdev, DMA_BIT_MASK(40)))
 		dma_bits = 40;
+	else
+		dma_bits = 32;
 
 	ret = pci_set_dma_mask(dev->pdev, DMA_BIT_MASK(dma_bits));
-	if (ret) {
-		NV_ERROR(dev, "Error setting DMA mask: %d\n", ret);
+	if (ret)
 		return ret;
-	}
+
+	ret = nouveau_mem_detect(dev);
+	if (ret)
+		return ret;
+
+	dev_priv->fb_phys = pci_resource_start(dev->pdev, 1);
 
 	ret = nouveau_ttm_global_init(dev_priv);
 	if (ret)
@@ -659,17 +566,22 @@ nouveau_mem_init(struct drm_device *dev)
 		return ret;
 	}
 
-	INIT_LIST_HEAD(&dev_priv->ttm.bo_list);
-	spin_lock_init(&dev_priv->ttm.bo_list_lock);
-	spin_lock_init(&dev_priv->tile.lock);
-
 	dev_priv->fb_available_size = dev_priv->vram_size;
 	dev_priv->fb_mappable_pages = dev_priv->fb_available_size;
 	if (dev_priv->fb_mappable_pages > drm_get_resource_len(dev, 1))
 		dev_priv->fb_mappable_pages = drm_get_resource_len(dev, 1);
 	dev_priv->fb_mappable_pages >>= PAGE_SHIFT;
 
-	/* remove reserved space at end of vram from available amount */
+	/* reserve space at end of VRAM for PRAMIN */
+	if (dev_priv->chipset == 0x40 || dev_priv->chipset == 0x47 ||
+	    dev_priv->chipset == 0x49 || dev_priv->chipset == 0x4b)
+		dev_priv->ramin_rsvd_vram = (2 * 1024 * 1024);
+	else
+	if (dev_priv->card_type >= NV_40)
+		dev_priv->ramin_rsvd_vram = (1 * 1024 * 1024);
+	else
+		dev_priv->ramin_rsvd_vram = (512 * 1024);
+
 	dev_priv->fb_available_size -= dev_priv->ramin_rsvd_vram;
 	dev_priv->fb_aper_free = dev_priv->fb_available_size;
 
@@ -690,9 +602,23 @@ nouveau_mem_init(struct drm_device *dev)
 		nouveau_bo_ref(NULL, &dev_priv->vga_ram);
 	}
 
-	/* GART */
+	dev_priv->fb_mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 1),
+					 pci_resource_len(dev->pdev, 1),
+					 DRM_MTRR_WC);
+	return 0;
+}
+
+int
+nouveau_mem_gart_init(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct ttm_bo_device *bdev = &dev_priv->ttm.bdev;
+	int ret;
+
+	dev_priv->gart_info.type = NOUVEAU_GART_NONE;
+
 #if !defined(__powerpc__) && !defined(__ia64__)
-	if (drm_device_is_agp(dev) && dev->agp) {
+	if (drm_device_is_agp(dev) && dev->agp && nouveau_agpmode) {
 		ret = nouveau_mem_init_agp(dev);
 		if (ret)
 			NV_ERROR(dev, "Error initialising AGP: %d\n", ret);
@@ -718,11 +644,6 @@ nouveau_mem_init(struct drm_device *dev)
 		return ret;
 	}
 
-	dev_priv->fb_mtrr = drm_mtrr_add(drm_get_resource_start(dev, 1),
-					 drm_get_resource_len(dev, 1),
-					 DRM_MTRR_WC);
-
 	return 0;
 }
 
-
diff --git a/drivers/gpu/drm/nouveau/nouveau_notifier.c b/drivers/gpu/drm/nouveau/nouveau_notifier.c
index 9537f3e..22b8618 100644
--- a/drivers/gpu/drm/nouveau/nouveau_notifier.c
+++ b/drivers/gpu/drm/nouveau/nouveau_notifier.c
@@ -28,6 +28,7 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
+#include "nouveau_ramht.h"
 
 int
 nouveau_notifier_init_channel(struct nouveau_channel *chan)
@@ -55,7 +56,7 @@ nouveau_notifier_init_channel(struct nouveau_channel *chan)
 	if (ret)
 		goto out_err;
 
-	ret = nouveau_mem_init_heap(&chan->notifier_heap, 0, ntfy->bo.mem.size);
+	ret = drm_mm_init(&chan->notifier_heap, 0, ntfy->bo.mem.size);
 	if (ret)
 		goto out_err;
 
@@ -80,7 +81,7 @@ nouveau_notifier_takedown_channel(struct nouveau_channel *chan)
 	nouveau_bo_unpin(chan->notifier_bo);
 	mutex_unlock(&dev->struct_mutex);
 	drm_gem_object_unreference_unlocked(chan->notifier_bo->gem);
-	nouveau_mem_takedown(&chan->notifier_heap);
+	drm_mm_takedown(&chan->notifier_heap);
 }
 
 static void
@@ -90,7 +91,7 @@ nouveau_notifier_gpuobj_dtor(struct drm_device *dev,
 	NV_DEBUG(dev, "\n");
 
 	if (gpuobj->priv)
-		nouveau_mem_free_block(gpuobj->priv);
+		drm_mm_put_block(gpuobj->priv);
 }
 
 int
@@ -100,18 +101,13 @@ nouveau_notifier_alloc(struct nouveau_channel *chan, uint32_t handle,
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_gpuobj *nobj = NULL;
-	struct mem_block *mem;
+	struct drm_mm_node *mem;
 	uint32_t offset;
 	int target, ret;
 
-	if (!chan->notifier_heap) {
-		NV_ERROR(dev, "Channel %d doesn't have a notifier heap!\n",
-			 chan->id);
-		return -EINVAL;
-	}
-
-	mem = nouveau_mem_alloc_block(chan->notifier_heap, size, 0,
-				      (struct drm_file *)-2, 0);
+	mem = drm_mm_search_free(&chan->notifier_heap, size, 0, 0);
+	if (mem)
+		mem = drm_mm_get_block(mem, size, 0);
 	if (!mem) {
 		NV_ERROR(dev, "Channel %d notifier block full\n", chan->id);
 		return -ENOMEM;
@@ -144,18 +140,18 @@ nouveau_notifier_alloc(struct nouveau_channel *chan, uint32_t handle,
 				     mem->size, NV_DMA_ACCESS_RW, target,
 				     &nobj);
 	if (ret) {
-		nouveau_mem_free_block(mem);
+		drm_mm_put_block(mem);
 		NV_ERROR(dev, "Error creating notifier ctxdma: %d\n", ret);
 		return ret;
 	}
-	nobj->dtor   = nouveau_notifier_gpuobj_dtor;
-	nobj->priv   = mem;
+	nobj->dtor = nouveau_notifier_gpuobj_dtor;
+	nobj->priv = mem;
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, handle, nobj, NULL);
+	ret = nouveau_ramht_insert(chan, handle, nobj);
+	nouveau_gpuobj_ref(NULL, &nobj);
 	if (ret) {
-		nouveau_gpuobj_del(dev, &nobj);
-		nouveau_mem_free_block(mem);
-		NV_ERROR(dev, "Error referencing notifier ctxdma: %d\n", ret);
+		drm_mm_put_block(mem);
+		NV_ERROR(dev, "Error adding notifier to ramht: %d\n", ret);
 		return ret;
 	}
 
@@ -170,7 +166,7 @@ nouveau_notifier_offset(struct nouveau_gpuobj *nobj, uint32_t *poffset)
 		return -EINVAL;
 
 	if (poffset) {
-		struct mem_block *mem = nobj->priv;
+		struct drm_mm_node *mem = nobj->priv;
 
 		if (*poffset >= mem->size)
 			return false;
@@ -189,7 +185,6 @@ nouveau_ioctl_notifier_alloc(struct drm_device *dev, void *data,
 	struct nouveau_channel *chan;
 	int ret;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
 	NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(na->channel, file_priv, chan);
 
 	ret = nouveau_notifier_alloc(chan, na->handle, na->size, &na->offset);
diff --git a/drivers/gpu/drm/nouveau/nouveau_object.c b/drivers/gpu/drm/nouveau/nouveau_object.c
index e7c100b..115904d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_object.c
+++ b/drivers/gpu/drm/nouveau/nouveau_object.c
@@ -34,6 +34,7 @@
 #include "drm.h"
 #include "nouveau_drv.h"
 #include "nouveau_drm.h"
+#include "nouveau_ramht.h"
 
 /* NVidia uses context objects to drive drawing operations.
 
@@ -65,141 +66,6 @@
    The key into the hash table depends on the object handle and channel id and
    is given as:
 */
-static uint32_t
-nouveau_ramht_hash_handle(struct drm_device *dev, int channel, uint32_t handle)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t hash = 0;
-	int i;
-
-	NV_DEBUG(dev, "ch%d handle=0x%08x\n", channel, handle);
-
-	for (i = 32; i > 0; i -= dev_priv->ramht_bits) {
-		hash ^= (handle & ((1 << dev_priv->ramht_bits) - 1));
-		handle >>= dev_priv->ramht_bits;
-	}
-
-	if (dev_priv->card_type < NV_50)
-		hash ^= channel << (dev_priv->ramht_bits - 4);
-	hash <<= 3;
-
-	NV_DEBUG(dev, "hash=0x%08x\n", hash);
-	return hash;
-}
-
-static int
-nouveau_ramht_entry_valid(struct drm_device *dev, struct nouveau_gpuobj *ramht,
-			  uint32_t offset)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t ctx = nv_ro32(dev, ramht, (offset + 4)/4);
-
-	if (dev_priv->card_type < NV_40)
-		return ((ctx & NV_RAMHT_CONTEXT_VALID) != 0);
-	return (ctx != 0);
-}
-
-static int
-nouveau_ramht_insert(struct drm_device *dev, struct nouveau_gpuobj_ref *ref)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_instmem_engine *instmem = &dev_priv->engine.instmem;
-	struct nouveau_channel *chan = ref->channel;
-	struct nouveau_gpuobj *ramht = chan->ramht ? chan->ramht->gpuobj : NULL;
-	uint32_t ctx, co, ho;
-
-	if (!ramht) {
-		NV_ERROR(dev, "No hash table!\n");
-		return -EINVAL;
-	}
-
-	if (dev_priv->card_type < NV_40) {
-		ctx = NV_RAMHT_CONTEXT_VALID | (ref->instance >> 4) |
-		      (chan->id << NV_RAMHT_CONTEXT_CHANNEL_SHIFT) |
-		      (ref->gpuobj->engine << NV_RAMHT_CONTEXT_ENGINE_SHIFT);
-	} else
-	if (dev_priv->card_type < NV_50) {
-		ctx = (ref->instance >> 4) |
-		      (chan->id << NV40_RAMHT_CONTEXT_CHANNEL_SHIFT) |
-		      (ref->gpuobj->engine << NV40_RAMHT_CONTEXT_ENGINE_SHIFT);
-	} else {
-		if (ref->gpuobj->engine == NVOBJ_ENGINE_DISPLAY) {
-			ctx = (ref->instance << 10) | 2;
-		} else {
-			ctx = (ref->instance >> 4) |
-			      ((ref->gpuobj->engine <<
-				NV40_RAMHT_CONTEXT_ENGINE_SHIFT));
-		}
-	}
-
-	instmem->prepare_access(dev, true);
-	co = ho = nouveau_ramht_hash_handle(dev, chan->id, ref->handle);
-	do {
-		if (!nouveau_ramht_entry_valid(dev, ramht, co)) {
-			NV_DEBUG(dev,
-				 "insert ch%d 0x%08x: h=0x%08x, c=0x%08x\n",
-				 chan->id, co, ref->handle, ctx);
-			nv_wo32(dev, ramht, (co + 0)/4, ref->handle);
-			nv_wo32(dev, ramht, (co + 4)/4, ctx);
-
-			list_add_tail(&ref->list, &chan->ramht_refs);
-			instmem->finish_access(dev);
-			return 0;
-		}
-		NV_DEBUG(dev, "collision ch%d 0x%08x: h=0x%08x\n",
-			 chan->id, co, nv_ro32(dev, ramht, co/4));
-
-		co += 8;
-		if (co >= dev_priv->ramht_size)
-			co = 0;
-	} while (co != ho);
-	instmem->finish_access(dev);
-
-	NV_ERROR(dev, "RAMHT space exhausted. ch=%d\n", chan->id);
-	return -ENOMEM;
-}
-
-static void
-nouveau_ramht_remove(struct drm_device *dev, struct nouveau_gpuobj_ref *ref)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_instmem_engine *instmem = &dev_priv->engine.instmem;
-	struct nouveau_channel *chan = ref->channel;
-	struct nouveau_gpuobj *ramht = chan->ramht ? chan->ramht->gpuobj : NULL;
-	uint32_t co, ho;
-
-	if (!ramht) {
-		NV_ERROR(dev, "No hash table!\n");
-		return;
-	}
-
-	instmem->prepare_access(dev, true);
-	co = ho = nouveau_ramht_hash_handle(dev, chan->id, ref->handle);
-	do {
-		if (nouveau_ramht_entry_valid(dev, ramht, co) &&
-		    (ref->handle == nv_ro32(dev, ramht, (co/4)))) {
-			NV_DEBUG(dev,
-				 "remove ch%d 0x%08x: h=0x%08x, c=0x%08x\n",
-				 chan->id, co, ref->handle,
-				 nv_ro32(dev, ramht, (co + 4)));
-			nv_wo32(dev, ramht, (co + 0)/4, 0x00000000);
-			nv_wo32(dev, ramht, (co + 4)/4, 0x00000000);
-
-			list_del(&ref->list);
-			instmem->finish_access(dev);
-			return;
-		}
-
-		co += 8;
-		if (co >= dev_priv->ramht_size)
-			co = 0;
-	} while (co != ho);
-	list_del(&ref->list);
-	instmem->finish_access(dev);
-
-	NV_ERROR(dev, "RAMHT entry not found. ch=%d, handle=0x%08x\n",
-		 chan->id, ref->handle);
-}
 
 int
 nouveau_gpuobj_new(struct drm_device *dev, struct nouveau_channel *chan,
@@ -209,7 +75,7 @@ nouveau_gpuobj_new(struct drm_device *dev, struct nouveau_channel *chan,
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_engine *engine = &dev_priv->engine;
 	struct nouveau_gpuobj *gpuobj;
-	struct mem_block *pramin = NULL;
+	struct drm_mm_node *ramin = NULL;
 	int ret;
 
 	NV_DEBUG(dev, "ch%d size=%u align=%d flags=0x%08x\n",
@@ -222,82 +88,102 @@ nouveau_gpuobj_new(struct drm_device *dev, struct nouveau_channel *chan,
 	if (!gpuobj)
 		return -ENOMEM;
 	NV_DEBUG(dev, "gpuobj %p\n", gpuobj);
+	gpuobj->dev = dev;
 	gpuobj->flags = flags;
-	gpuobj->im_channel = chan;
+	kref_init(&gpuobj->refcount);
+	gpuobj->size = size;
 
+	spin_lock(&dev_priv->ramin_lock);
 	list_add_tail(&gpuobj->list, &dev_priv->gpuobj_list);
+	spin_unlock(&dev_priv->ramin_lock);
 
-	/* Choose between global instmem heap, and per-channel private
-	 * instmem heap.  On <NV50 allow requests for private instmem
-	 * to be satisfied from global heap if no per-channel area
-	 * available.
-	 */
 	if (chan) {
-		if (chan->ramin_heap) {
-			NV_DEBUG(dev, "private heap\n");
-			pramin = chan->ramin_heap;
-		} else
-		if (dev_priv->card_type < NV_50) {
-			NV_DEBUG(dev, "global heap fallback\n");
-			pramin = dev_priv->ramin_heap;
+		NV_DEBUG(dev, "channel heap\n");
+
+		ramin = drm_mm_search_free(&chan->ramin_heap, size, align, 0);
+		if (ramin)
+			ramin = drm_mm_get_block(ramin, size, align);
+
+		if (!ramin) {
+			nouveau_gpuobj_ref(NULL, &gpuobj);
+			return -ENOMEM;
 		}
 	} else {
 		NV_DEBUG(dev, "global heap\n");
-		pramin = dev_priv->ramin_heap;
-	}
 
-	if (!pramin) {
-		NV_ERROR(dev, "No PRAMIN heap!\n");
-		return -EINVAL;
-	}
-
-	if (!chan) {
+		/* allocate backing pages, sets vinst */
 		ret = engine->instmem.populate(dev, gpuobj, &size);
 		if (ret) {
-			nouveau_gpuobj_del(dev, &gpuobj);
+			nouveau_gpuobj_ref(NULL, &gpuobj);
 			return ret;
 		}
-	}
 
-	/* Allocate a chunk of the PRAMIN aperture */
-	gpuobj->im_pramin = nouveau_mem_alloc_block(pramin, size,
-						    drm_order(align),
-						    (struct drm_file *)-2, 0);
-	if (!gpuobj->im_pramin) {
-		nouveau_gpuobj_del(dev, &gpuobj);
-		return -ENOMEM;
+		/* try to get aperture space */
+		do {
+			if (drm_mm_pre_get(&dev_priv->ramin_heap))
+				return -ENOMEM;
+
+			spin_lock(&dev_priv->ramin_lock);
+			ramin = drm_mm_search_free(&dev_priv->ramin_heap, size,
+						   align, 0);
+			if (ramin == NULL) {
+				spin_unlock(&dev_priv->ramin_lock);
+				nouveau_gpuobj_ref(NULL, &gpuobj);
+				return -ENOMEM;
+			}
+
+			ramin = drm_mm_get_block_atomic(ramin, size, align);
+			spin_unlock(&dev_priv->ramin_lock);
+		} while (ramin == NULL);
+
+		/* on nv50 it's ok to fail, we have a fallback path */
+		if (!ramin && dev_priv->card_type < NV_50) {
+			nouveau_gpuobj_ref(NULL, &gpuobj);
+			return -ENOMEM;
+		}
 	}
 
-	if (!chan) {
+	/* if we got a chunk of the aperture, map pages into it */
+	gpuobj->im_pramin = ramin;
+	if (!chan && gpuobj->im_pramin && dev_priv->ramin_available) {
 		ret = engine->instmem.bind(dev, gpuobj);
 		if (ret) {
-			nouveau_gpuobj_del(dev, &gpuobj);
+			nouveau_gpuobj_ref(NULL, &gpuobj);
 			return ret;
 		}
 	}
 
-	if (gpuobj->flags & NVOBJ_FLAG_ZERO_ALLOC) {
-		int i;
+	/* calculate the various different addresses for the object */
+	if (chan) {
+		gpuobj->pinst = chan->ramin->pinst;
+		if (gpuobj->pinst != ~0)
+			gpuobj->pinst += gpuobj->im_pramin->start;
 
-		engine->instmem.prepare_access(dev, true);
-		for (i = 0; i < gpuobj->im_pramin->size; i += 4)
-			nv_wo32(dev, gpuobj, i/4, 0);
-		engine->instmem.finish_access(dev);
+		if (dev_priv->card_type < NV_50) {
+			gpuobj->cinst = gpuobj->pinst;
+		} else {
+			gpuobj->cinst = gpuobj->im_pramin->start;
+			gpuobj->vinst = gpuobj->im_pramin->start +
+					chan->ramin->vinst;
+		}
+	} else {
+		if (gpuobj->im_pramin)
+			gpuobj->pinst = gpuobj->im_pramin->start;
+		else
+			gpuobj->pinst = ~0;
+		gpuobj->cinst = 0xdeadbeef;
 	}
 
-	*gpuobj_ret = gpuobj;
-	return 0;
-}
-
-int
-nouveau_gpuobj_early_init(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	if (gpuobj->flags & NVOBJ_FLAG_ZERO_ALLOC) {
+		int i;
 
-	NV_DEBUG(dev, "\n");
+		for (i = 0; i < gpuobj->size; i += 4)
+			nv_wo32(gpuobj, i, 0);
+		engine->instmem.flush(dev);
+	}
 
-	INIT_LIST_HEAD(&dev_priv->gpuobj_list);
 
+	*gpuobj_ret = gpuobj;
 	return 0;
 }
 
@@ -305,18 +191,12 @@ int
 nouveau_gpuobj_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	int ret;
 
 	NV_DEBUG(dev, "\n");
 
-	if (dev_priv->card_type < NV_50) {
-		ret = nouveau_gpuobj_new_fake(dev,
-			dev_priv->ramht_offset, ~0, dev_priv->ramht_size,
-			NVOBJ_FLAG_ZERO_ALLOC | NVOBJ_FLAG_ALLOW_NO_REFS,
-						&dev_priv->ramht, NULL);
-		if (ret)
-			return ret;
-	}
+	INIT_LIST_HEAD(&dev_priv->gpuobj_list);
+	spin_lock_init(&dev_priv->ramin_lock);
+	dev_priv->ramin_base = ~0;
 
 	return 0;
 }
@@ -328,299 +208,89 @@ nouveau_gpuobj_takedown(struct drm_device *dev)
 
 	NV_DEBUG(dev, "\n");
 
-	nouveau_gpuobj_del(dev, &dev_priv->ramht);
+	BUG_ON(!list_empty(&dev_priv->gpuobj_list));
 }
 
-void
-nouveau_gpuobj_late_takedown(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *gpuobj = NULL;
-	struct list_head *entry, *tmp;
-
-	NV_DEBUG(dev, "\n");
-
-	list_for_each_safe(entry, tmp, &dev_priv->gpuobj_list) {
-		gpuobj = list_entry(entry, struct nouveau_gpuobj, list);
 
-		NV_ERROR(dev, "gpuobj %p still exists at takedown, refs=%d\n",
-			 gpuobj, gpuobj->refcount);
-		gpuobj->refcount = 0;
-		nouveau_gpuobj_del(dev, &gpuobj);
-	}
-}
-
-int
-nouveau_gpuobj_del(struct drm_device *dev, struct nouveau_gpuobj **pgpuobj)
+static void
+nouveau_gpuobj_del(struct kref *ref)
 {
+	struct nouveau_gpuobj *gpuobj =
+		container_of(ref, struct nouveau_gpuobj, refcount);
+	struct drm_device *dev = gpuobj->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_engine *engine = &dev_priv->engine;
-	struct nouveau_gpuobj *gpuobj;
 	int i;
 
-	NV_DEBUG(dev, "gpuobj %p\n", pgpuobj ? *pgpuobj : NULL);
-
-	if (!dev_priv || !pgpuobj || !(*pgpuobj))
-		return -EINVAL;
-	gpuobj = *pgpuobj;
-
-	if (gpuobj->refcount != 0) {
-		NV_ERROR(dev, "gpuobj refcount is %d\n", gpuobj->refcount);
-		return -EINVAL;
-	}
+	NV_DEBUG(dev, "gpuobj %p\n", gpuobj);
 
 	if (gpuobj->im_pramin && (gpuobj->flags & NVOBJ_FLAG_ZERO_FREE)) {
-		engine->instmem.prepare_access(dev, true);
-		for (i = 0; i < gpuobj->im_pramin->size; i += 4)
-			nv_wo32(dev, gpuobj, i/4, 0);
-		engine->instmem.finish_access(dev);
+		for (i = 0; i < gpuobj->size; i += 4)
+			nv_wo32(gpuobj, i, 0);
+		engine->instmem.flush(dev);
 	}
 
 	if (gpuobj->dtor)
 		gpuobj->dtor(dev, gpuobj);
 
-	if (gpuobj->im_backing && !(gpuobj->flags & NVOBJ_FLAG_FAKE))
+	if (gpuobj->im_backing)
 		engine->instmem.clear(dev, gpuobj);
 
-	if (gpuobj->im_pramin) {
-		if (gpuobj->flags & NVOBJ_FLAG_FAKE)
-			kfree(gpuobj->im_pramin);
-		else
-			nouveau_mem_free_block(gpuobj->im_pramin);
-	}
-
+	spin_lock(&dev_priv->ramin_lock);
+	if (gpuobj->im_pramin)
+		drm_mm_put_block(gpuobj->im_pramin);
 	list_del(&gpuobj->list);
+	spin_unlock(&dev_priv->ramin_lock);
 
-	*pgpuobj = NULL;
 	kfree(gpuobj);
-	return 0;
-}
-
-static int
-nouveau_gpuobj_instance_get(struct drm_device *dev,
-			    struct nouveau_channel *chan,
-			    struct nouveau_gpuobj *gpuobj, uint32_t *inst)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *cpramin;
-
-	/* <NV50 use PRAMIN address everywhere */
-	if (dev_priv->card_type < NV_50) {
-		*inst = gpuobj->im_pramin->start;
-		return 0;
-	}
-
-	if (chan && gpuobj->im_channel != chan) {
-		NV_ERROR(dev, "Channel mismatch: obj %d, ref %d\n",
-			 gpuobj->im_channel->id, chan->id);
-		return -EINVAL;
-	}
-
-	/* NV50 channel-local instance */
-	if (chan) {
-		cpramin = chan->ramin->gpuobj;
-		*inst = gpuobj->im_pramin->start - cpramin->im_pramin->start;
-		return 0;
-	}
-
-	/* NV50 global (VRAM) instance */
-	if (!gpuobj->im_channel) {
-		/* ...from global heap */
-		if (!gpuobj->im_backing) {
-			NV_ERROR(dev, "AII, no VRAM backing gpuobj\n");
-			return -EINVAL;
-		}
-		*inst = gpuobj->im_backing_start;
-		return 0;
-	} else {
-		/* ...from local heap */
-		cpramin = gpuobj->im_channel->ramin->gpuobj;
-		*inst = cpramin->im_backing_start +
-			(gpuobj->im_pramin->start - cpramin->im_pramin->start);
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
-int
-nouveau_gpuobj_ref_add(struct drm_device *dev, struct nouveau_channel *chan,
-		       uint32_t handle, struct nouveau_gpuobj *gpuobj,
-		       struct nouveau_gpuobj_ref **ref_ret)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj_ref *ref;
-	uint32_t instance;
-	int ret;
-
-	NV_DEBUG(dev, "ch%d h=0x%08x gpuobj=%p\n",
-		 chan ? chan->id : -1, handle, gpuobj);
-
-	if (!dev_priv || !gpuobj || (ref_ret && *ref_ret != NULL))
-		return -EINVAL;
-
-	if (!chan && !ref_ret)
-		return -EINVAL;
-
-	if (gpuobj->engine == NVOBJ_ENGINE_SW && !gpuobj->im_pramin) {
-		/* sw object */
-		instance = 0x40;
-	} else {
-		ret = nouveau_gpuobj_instance_get(dev, chan, gpuobj, &instance);
-		if (ret)
-			return ret;
-	}
-
-	ref = kzalloc(sizeof(*ref), GFP_KERNEL);
-	if (!ref)
-		return -ENOMEM;
-	INIT_LIST_HEAD(&ref->list);
-	ref->gpuobj   = gpuobj;
-	ref->channel  = chan;
-	ref->instance = instance;
-
-	if (!ref_ret) {
-		ref->handle = handle;
-
-		ret = nouveau_ramht_insert(dev, ref);
-		if (ret) {
-			kfree(ref);
-			return ret;
-		}
-	} else {
-		ref->handle = ~0;
-		*ref_ret = ref;
-	}
-
-	ref->gpuobj->refcount++;
-	return 0;
 }
 
-int nouveau_gpuobj_ref_del(struct drm_device *dev, struct nouveau_gpuobj_ref **pref)
-{
-	struct nouveau_gpuobj_ref *ref;
-
-	NV_DEBUG(dev, "ref %p\n", pref ? *pref : NULL);
-
-	if (!dev || !pref || *pref == NULL)
-		return -EINVAL;
-	ref = *pref;
-
-	if (ref->handle != ~0)
-		nouveau_ramht_remove(dev, ref);
-
-	if (ref->gpuobj) {
-		ref->gpuobj->refcount--;
-
-		if (ref->gpuobj->refcount == 0) {
-			if (!(ref->gpuobj->flags & NVOBJ_FLAG_ALLOW_NO_REFS))
-				nouveau_gpuobj_del(dev, &ref->gpuobj);
-		}
-	}
-
-	*pref = NULL;
-	kfree(ref);
-	return 0;
-}
-
-int
-nouveau_gpuobj_new_ref(struct drm_device *dev,
-		       struct nouveau_channel *oc, struct nouveau_channel *rc,
-		       uint32_t handle, uint32_t size, int align,
-		       uint32_t flags, struct nouveau_gpuobj_ref **ref)
-{
-	struct nouveau_gpuobj *gpuobj = NULL;
-	int ret;
-
-	ret = nouveau_gpuobj_new(dev, oc, size, align, flags, &gpuobj);
-	if (ret)
-		return ret;
-
-	ret = nouveau_gpuobj_ref_add(dev, rc, handle, gpuobj, ref);
-	if (ret) {
-		nouveau_gpuobj_del(dev, &gpuobj);
-		return ret;
-	}
-
-	return 0;
-}
-
-int
-nouveau_gpuobj_ref_find(struct nouveau_channel *chan, uint32_t handle,
-			struct nouveau_gpuobj_ref **ref_ret)
+void
+nouveau_gpuobj_ref(struct nouveau_gpuobj *ref, struct nouveau_gpuobj **ptr)
 {
-	struct nouveau_gpuobj_ref *ref;
-	struct list_head *entry, *tmp;
+	if (ref)
+		kref_get(&ref->refcount);
 
-	list_for_each_safe(entry, tmp, &chan->ramht_refs) {
-		ref = list_entry(entry, struct nouveau_gpuobj_ref, list);
+	if (*ptr)
+		kref_put(&(*ptr)->refcount, nouveau_gpuobj_del);
 
-		if (ref->handle == handle) {
-			if (ref_ret)
-				*ref_ret = ref;
-			return 0;
-		}
-	}
-
-	return -EINVAL;
+	*ptr = ref;
 }
 
 int
-nouveau_gpuobj_new_fake(struct drm_device *dev, uint32_t p_offset,
-			uint32_t b_offset, uint32_t size,
-			uint32_t flags, struct nouveau_gpuobj **pgpuobj,
-			struct nouveau_gpuobj_ref **pref)
+nouveau_gpuobj_new_fake(struct drm_device *dev, u32 pinst, u64 vinst,
+			u32 size, u32 flags, struct nouveau_gpuobj **pgpuobj)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_gpuobj *gpuobj = NULL;
 	int i;
 
 	NV_DEBUG(dev,
-		 "p_offset=0x%08x b_offset=0x%08x size=0x%08x flags=0x%08x\n",
-		 p_offset, b_offset, size, flags);
+		 "pinst=0x%08x vinst=0x%010llx size=0x%08x flags=0x%08x\n",
+		 pinst, vinst, size, flags);
 
 	gpuobj = kzalloc(sizeof(*gpuobj), GFP_KERNEL);
 	if (!gpuobj)
 		return -ENOMEM;
 	NV_DEBUG(dev, "gpuobj %p\n", gpuobj);
-	gpuobj->im_channel = NULL;
-	gpuobj->flags      = flags | NVOBJ_FLAG_FAKE;
-
-	list_add_tail(&gpuobj->list, &dev_priv->gpuobj_list);
-
-	if (p_offset != ~0) {
-		gpuobj->im_pramin = kzalloc(sizeof(struct mem_block),
-					    GFP_KERNEL);
-		if (!gpuobj->im_pramin) {
-			nouveau_gpuobj_del(dev, &gpuobj);
-			return -ENOMEM;
-		}
-		gpuobj->im_pramin->start = p_offset;
-		gpuobj->im_pramin->size  = size;
-	}
-
-	if (b_offset != ~0) {
-		gpuobj->im_backing = (struct nouveau_bo *)-1;
-		gpuobj->im_backing_start = b_offset;
-	}
+	gpuobj->dev = dev;
+	gpuobj->flags = flags;
+	kref_init(&gpuobj->refcount);
+	gpuobj->size  = size;
+	gpuobj->pinst = pinst;
+	gpuobj->cinst = 0xdeadbeef;
+	gpuobj->vinst = vinst;
 
 	if (gpuobj->flags & NVOBJ_FLAG_ZERO_ALLOC) {
-		dev_priv->engine.instmem.prepare_access(dev, true);
-		for (i = 0; i < gpuobj->im_pramin->size; i += 4)
-			nv_wo32(dev, gpuobj, i/4, 0);
-		dev_priv->engine.instmem.finish_access(dev);
+		for (i = 0; i < gpuobj->size; i += 4)
+			nv_wo32(gpuobj, i, 0);
+		dev_priv->engine.instmem.flush(dev);
 	}
 
-	if (pref) {
-		i = nouveau_gpuobj_ref_add(dev, NULL, 0, gpuobj, pref);
-		if (i) {
-			nouveau_gpuobj_del(dev, &gpuobj);
-			return i;
-		}
-	}
-
-	if (pgpuobj)
-		*pgpuobj = gpuobj;
+	spin_lock(&dev_priv->ramin_lock);
+	list_add_tail(&gpuobj->list, &dev_priv->gpuobj_list);
+	spin_unlock(&dev_priv->ramin_lock);
+	*pgpuobj = gpuobj;
 	return 0;
 }
 
@@ -696,8 +366,6 @@ nouveau_gpuobj_dma_new(struct nouveau_channel *chan, int class,
 		return ret;
 	}
 
-	instmem->prepare_access(dev, true);
-
 	if (dev_priv->card_type < NV_50) {
 		uint32_t frame, adjust, pte_flags = 0;
 
@@ -706,14 +374,12 @@ nouveau_gpuobj_dma_new(struct nouveau_channel *chan, int class,
 		adjust = offset &  0x00000fff;
 		frame  = offset & ~0x00000fff;
 
-		nv_wo32(dev, *gpuobj, 0, ((1<<12) | (1<<13) |
-				(adjust << 20) |
-				 (access << 14) |
-				 (target << 16) |
-				  class));
-		nv_wo32(dev, *gpuobj, 1, size - 1);
-		nv_wo32(dev, *gpuobj, 2, frame | pte_flags);
-		nv_wo32(dev, *gpuobj, 3, frame | pte_flags);
+		nv_wo32(*gpuobj,  0, ((1<<12) | (1<<13) | (adjust << 20) |
+				      (access << 14) | (target << 16) |
+				      class));
+		nv_wo32(*gpuobj,  4, size - 1);
+		nv_wo32(*gpuobj,  8, frame | pte_flags);
+		nv_wo32(*gpuobj, 12, frame | pte_flags);
 	} else {
 		uint64_t limit = offset + size - 1;
 		uint32_t flags0, flags5;
@@ -726,15 +392,15 @@ nouveau_gpuobj_dma_new(struct nouveau_channel *chan, int class,
 			flags5 = 0x00080000;
 		}
 
-		nv_wo32(dev, *gpuobj, 0, flags0 | class);
-		nv_wo32(dev, *gpuobj, 1, lower_32_bits(limit));
-		nv_wo32(dev, *gpuobj, 2, lower_32_bits(offset));
-		nv_wo32(dev, *gpuobj, 3, ((upper_32_bits(limit) & 0xff) << 24) |
-					(upper_32_bits(offset) & 0xff));
-		nv_wo32(dev, *gpuobj, 5, flags5);
+		nv_wo32(*gpuobj,  0, flags0 | class);
+		nv_wo32(*gpuobj,  4, lower_32_bits(limit));
+		nv_wo32(*gpuobj,  8, lower_32_bits(offset));
+		nv_wo32(*gpuobj, 12, ((upper_32_bits(limit) & 0xff) << 24) |
+				      (upper_32_bits(offset) & 0xff));
+		nv_wo32(*gpuobj, 20, flags5);
 	}
 
-	instmem->finish_access(dev);
+	instmem->flush(dev);
 
 	(*gpuobj)->engine = NVOBJ_ENGINE_SW;
 	(*gpuobj)->class  = class;
@@ -762,7 +428,7 @@ nouveau_gpuobj_gart_dma_new(struct nouveau_channel *chan,
 			*o_ret = 0;
 	} else
 	if (dev_priv->gart_info.type == NOUVEAU_GART_SGDMA) {
-		*gpuobj = dev_priv->gart_info.sg_ctxdma;
+		nouveau_gpuobj_ref(dev_priv->gart_info.sg_ctxdma, gpuobj);
 		if (offset & ~0xffffffffULL) {
 			NV_ERROR(dev, "obj offset exceeds 32-bits\n");
 			return -EINVAL;
@@ -849,32 +515,31 @@ nouveau_gpuobj_gr_new(struct nouveau_channel *chan, int class,
 		return ret;
 	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	if (dev_priv->card_type >= NV_50) {
-		nv_wo32(dev, *gpuobj, 0, class);
-		nv_wo32(dev, *gpuobj, 5, 0x00010000);
+		nv_wo32(*gpuobj,  0, class);
+		nv_wo32(*gpuobj, 20, 0x00010000);
 	} else {
 		switch (class) {
 		case NV_CLASS_NULL:
-			nv_wo32(dev, *gpuobj, 0, 0x00001030);
-			nv_wo32(dev, *gpuobj, 1, 0xFFFFFFFF);
+			nv_wo32(*gpuobj, 0, 0x00001030);
+			nv_wo32(*gpuobj, 4, 0xFFFFFFFF);
 			break;
 		default:
 			if (dev_priv->card_type >= NV_40) {
-				nv_wo32(dev, *gpuobj, 0, class);
+				nv_wo32(*gpuobj, 0, class);
 #ifdef __BIG_ENDIAN
-				nv_wo32(dev, *gpuobj, 2, 0x01000000);
+				nv_wo32(*gpuobj, 8, 0x01000000);
 #endif
 			} else {
 #ifdef __BIG_ENDIAN
-				nv_wo32(dev, *gpuobj, 0, class | 0x00080000);
+				nv_wo32(*gpuobj, 0, class | 0x00080000);
 #else
-				nv_wo32(dev, *gpuobj, 0, class);
+				nv_wo32(*gpuobj, 0, class);
 #endif
 			}
 		}
 	}
-	dev_priv->engine.instmem.finish_access(dev);
+	dev_priv->engine.instmem.flush(dev);
 
 	(*gpuobj)->engine = NVOBJ_ENGINE_GR;
 	(*gpuobj)->class  = class;
@@ -895,10 +560,15 @@ nouveau_gpuobj_sw_new(struct nouveau_channel *chan, int class,
 	gpuobj = kzalloc(sizeof(*gpuobj), GFP_KERNEL);
 	if (!gpuobj)
 		return -ENOMEM;
+	gpuobj->dev = chan->dev;
 	gpuobj->engine = NVOBJ_ENGINE_SW;
 	gpuobj->class = class;
+	kref_init(&gpuobj->refcount);
+	gpuobj->cinst = 0x40;
 
+	spin_lock(&dev_priv->ramin_lock);
 	list_add_tail(&gpuobj->list, &dev_priv->gpuobj_list);
+	spin_unlock(&dev_priv->ramin_lock);
 	*gpuobj_ret = gpuobj;
 	return 0;
 }
@@ -908,7 +578,6 @@ nouveau_gpuobj_channel_init_pramin(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *pramin = NULL;
 	uint32_t size;
 	uint32_t base;
 	int ret;
@@ -920,6 +589,7 @@ nouveau_gpuobj_channel_init_pramin(struct nouveau_channel *chan)
 	base = 0;
 
 	/* PGRAPH context */
+	size += dev_priv->engine.graph.grctx_size;
 
 	if (dev_priv->card_type == NV_50) {
 		/* Various fixed table thingos */
@@ -930,25 +600,18 @@ nouveau_gpuobj_channel_init_pramin(struct nouveau_channel *chan)
 		size += 0x8000;
 		/* RAMFC */
 		size += 0x1000;
-		/* PGRAPH context */
-		size += 0x70000;
 	}
 
-	NV_DEBUG(dev, "ch%d PRAMIN size: 0x%08x bytes, base alloc=0x%08x\n",
-		 chan->id, size, base);
-	ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0, size, 0x1000, 0,
-				     &chan->ramin);
+	ret = nouveau_gpuobj_new(dev, NULL, size, 0x1000, 0, &chan->ramin);
 	if (ret) {
 		NV_ERROR(dev, "Error allocating channel PRAMIN: %d\n", ret);
 		return ret;
 	}
-	pramin = chan->ramin->gpuobj;
 
-	ret = nouveau_mem_init_heap(&chan->ramin_heap,
-				    pramin->im_pramin->start + base, size);
+	ret = drm_mm_init(&chan->ramin_heap, base, size);
 	if (ret) {
 		NV_ERROR(dev, "Error creating PRAMIN heap: %d\n", ret);
-		nouveau_gpuobj_ref_del(dev, &chan->ramin);
+		nouveau_gpuobj_ref(NULL, &chan->ramin);
 		return ret;
 	}
 
@@ -965,19 +628,13 @@ nouveau_gpuobj_channel_init(struct nouveau_channel *chan,
 	struct nouveau_gpuobj *vram = NULL, *tt = NULL;
 	int ret, i;
 
-	INIT_LIST_HEAD(&chan->ramht_refs);
-
 	NV_DEBUG(dev, "ch%d vram=0x%08x tt=0x%08x\n", chan->id, vram_h, tt_h);
 
-	/* Reserve a block of PRAMIN for the channel
-	 *XXX: maybe on <NV50 too at some point
-	 */
-	if (0 || dev_priv->card_type == NV_50) {
-		ret = nouveau_gpuobj_channel_init_pramin(chan);
-		if (ret) {
-			NV_ERROR(dev, "init pramin\n");
-			return ret;
-		}
+	/* Allocate a chunk of memory for per-channel object storage */
+	ret = nouveau_gpuobj_channel_init_pramin(chan);
+	if (ret) {
+		NV_ERROR(dev, "init pramin\n");
+		return ret;
 	}
 
 	/* NV50 VM
@@ -986,65 +643,56 @@ nouveau_gpuobj_channel_init(struct nouveau_channel *chan,
 	 *    locations determined during init.
 	 */
 	if (dev_priv->card_type >= NV_50) {
-		uint32_t vm_offset, pde;
-
-		instmem->prepare_access(dev, true);
+		u32 pgd_offs = (dev_priv->chipset == 0x50) ? 0x1400 : 0x0200;
+		u64 vm_vinst = chan->ramin->vinst + pgd_offs;
+		u32 vm_pinst = chan->ramin->pinst;
+		u32 pde;
 
-		vm_offset = (dev_priv->chipset & 0xf0) == 0x50 ? 0x1400 : 0x200;
-		vm_offset += chan->ramin->gpuobj->im_pramin->start;
+		if (vm_pinst != ~0)
+			vm_pinst += pgd_offs;
 
-		ret = nouveau_gpuobj_new_fake(dev, vm_offset, ~0, 0x4000,
-							0, &chan->vm_pd, NULL);
-		if (ret) {
-			instmem->finish_access(dev);
+		ret = nouveau_gpuobj_new_fake(dev, vm_pinst, vm_vinst, 0x4000,
+					      0, &chan->vm_pd);
+		if (ret)
 			return ret;
-		}
 		for (i = 0; i < 0x4000; i += 8) {
-			nv_wo32(dev, chan->vm_pd, (i+0)/4, 0x00000000);
-			nv_wo32(dev, chan->vm_pd, (i+4)/4, 0xdeadcafe);
+			nv_wo32(chan->vm_pd, i + 0, 0x00000000);
+			nv_wo32(chan->vm_pd, i + 4, 0xdeadcafe);
 		}
 
-		pde = (dev_priv->vm_gart_base / (512*1024*1024)) * 2;
-		ret = nouveau_gpuobj_ref_add(dev, NULL, 0,
-					     dev_priv->gart_info.sg_ctxdma,
-					     &chan->vm_gart_pt);
-		if (ret) {
-			instmem->finish_access(dev);
-			return ret;
-		}
-		nv_wo32(dev, chan->vm_pd, pde++,
-			    chan->vm_gart_pt->instance | 0x03);
-		nv_wo32(dev, chan->vm_pd, pde++, 0x00000000);
+		nouveau_gpuobj_ref(dev_priv->gart_info.sg_ctxdma,
+				   &chan->vm_gart_pt);
+		pde = (dev_priv->vm_gart_base / (512*1024*1024)) * 8;
+		nv_wo32(chan->vm_pd, pde + 0, chan->vm_gart_pt->vinst | 3);
+		nv_wo32(chan->vm_pd, pde + 4, 0x00000000);
 
-		pde = (dev_priv->vm_vram_base / (512*1024*1024)) * 2;
+		pde = (dev_priv->vm_vram_base / (512*1024*1024)) * 8;
 		for (i = 0; i < dev_priv->vm_vram_pt_nr; i++) {
-			ret = nouveau_gpuobj_ref_add(dev, NULL, 0,
-						     dev_priv->vm_vram_pt[i],
-						     &chan->vm_vram_pt[i]);
-			if (ret) {
-				instmem->finish_access(dev);
-				return ret;
-			}
+			nouveau_gpuobj_ref(dev_priv->vm_vram_pt[i],
+					   &chan->vm_vram_pt[i]);
 
-			nv_wo32(dev, chan->vm_pd, pde++,
-				    chan->vm_vram_pt[i]->instance | 0x61);
-			nv_wo32(dev, chan->vm_pd, pde++, 0x00000000);
+			nv_wo32(chan->vm_pd, pde + 0,
+				chan->vm_vram_pt[i]->vinst | 0x61);
+			nv_wo32(chan->vm_pd, pde + 4, 0x00000000);
+			pde += 8;
 		}
 
-		instmem->finish_access(dev);
+		instmem->flush(dev);
 	}
 
 	/* RAMHT */
 	if (dev_priv->card_type < NV_50) {
-		ret = nouveau_gpuobj_ref_add(dev, NULL, 0, dev_priv->ramht,
-					     &chan->ramht);
+		nouveau_ramht_ref(dev_priv->ramht, &chan->ramht, NULL);
+	} else {
+		struct nouveau_gpuobj *ramht = NULL;
+
+		ret = nouveau_gpuobj_new(dev, chan, 0x8000, 16,
+					 NVOBJ_FLAG_ZERO_ALLOC, &ramht);
 		if (ret)
 			return ret;
-	} else {
-		ret = nouveau_gpuobj_new_ref(dev, chan, chan, 0,
-					     0x8000, 16,
-					     NVOBJ_FLAG_ZERO_ALLOC,
-					     &chan->ramht);
+
+		ret = nouveau_ramht_new(dev, ramht, &chan->ramht);
+		nouveau_gpuobj_ref(NULL, &ramht);
 		if (ret)
 			return ret;
 	}
@@ -1061,24 +709,32 @@ nouveau_gpuobj_channel_init(struct nouveau_channel *chan,
 		}
 	} else {
 		ret = nouveau_gpuobj_dma_new(chan, NV_CLASS_DMA_IN_MEMORY,
-						0, dev_priv->fb_available_size,
-						NV_DMA_ACCESS_RW,
-						NV_DMA_TARGET_VIDMEM, &vram);
+					     0, dev_priv->fb_available_size,
+					     NV_DMA_ACCESS_RW,
+					     NV_DMA_TARGET_VIDMEM, &vram);
 		if (ret) {
 			NV_ERROR(dev, "Error creating VRAM ctxdma: %d\n", ret);
 			return ret;
 		}
 	}
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, vram_h, vram, NULL);
+	ret = nouveau_ramht_insert(chan, vram_h, vram);
+	nouveau_gpuobj_ref(NULL, &vram);
 	if (ret) {
-		NV_ERROR(dev, "Error referencing VRAM ctxdma: %d\n", ret);
+		NV_ERROR(dev, "Error adding VRAM ctxdma to RAMHT: %d\n", ret);
 		return ret;
 	}
 
 	/* TT memory ctxdma */
 	if (dev_priv->card_type >= NV_50) {
-		tt = vram;
+		ret = nouveau_gpuobj_dma_new(chan, NV_CLASS_DMA_IN_MEMORY,
+					     0, dev_priv->vm_end,
+					     NV_DMA_ACCESS_RW,
+					     NV_DMA_TARGET_AGP, &tt);
+		if (ret) {
+			NV_ERROR(dev, "Error creating TT ctxdma: %d\n", ret);
+			return ret;
+		}
 	} else
 	if (dev_priv->gart_info.type != NOUVEAU_GART_NONE) {
 		ret = nouveau_gpuobj_gart_dma_new(chan, 0,
@@ -1094,9 +750,10 @@ nouveau_gpuobj_channel_init(struct nouveau_channel *chan,
 		return ret;
 	}
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, tt_h, tt, NULL);
+	ret = nouveau_ramht_insert(chan, tt_h, tt);
+	nouveau_gpuobj_ref(NULL, &tt);
 	if (ret) {
-		NV_ERROR(dev, "Error referencing TT ctxdma: %d\n", ret);
+		NV_ERROR(dev, "Error adding TT ctxdma to RAMHT: %d\n", ret);
 		return ret;
 	}
 
@@ -1108,33 +765,23 @@ nouveau_gpuobj_channel_takedown(struct nouveau_channel *chan)
 {
 	struct drm_nouveau_private *dev_priv = chan->dev->dev_private;
 	struct drm_device *dev = chan->dev;
-	struct list_head *entry, *tmp;
-	struct nouveau_gpuobj_ref *ref;
 	int i;
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
-	if (!chan->ramht_refs.next)
+	if (!chan->ramht)
 		return;
 
-	list_for_each_safe(entry, tmp, &chan->ramht_refs) {
-		ref = list_entry(entry, struct nouveau_gpuobj_ref, list);
+	nouveau_ramht_ref(NULL, &chan->ramht, chan);
 
-		nouveau_gpuobj_ref_del(dev, &ref);
-	}
-
-	nouveau_gpuobj_ref_del(dev, &chan->ramht);
-
-	nouveau_gpuobj_del(dev, &chan->vm_pd);
-	nouveau_gpuobj_ref_del(dev, &chan->vm_gart_pt);
+	nouveau_gpuobj_ref(NULL, &chan->vm_pd);
+	nouveau_gpuobj_ref(NULL, &chan->vm_gart_pt);
 	for (i = 0; i < dev_priv->vm_vram_pt_nr; i++)
-		nouveau_gpuobj_ref_del(dev, &chan->vm_vram_pt[i]);
-
-	if (chan->ramin_heap)
-		nouveau_mem_takedown(&chan->ramin_heap);
-	if (chan->ramin)
-		nouveau_gpuobj_ref_del(dev, &chan->ramin);
+		nouveau_gpuobj_ref(NULL, &chan->vm_vram_pt[i]);
 
+	if (chan->ramin_heap.fl_entry.next)
+		drm_mm_takedown(&chan->ramin_heap);
+	nouveau_gpuobj_ref(NULL, &chan->ramin);
 }
 
 int
@@ -1155,19 +802,17 @@ nouveau_gpuobj_suspend(struct drm_device *dev)
 	}
 
 	list_for_each_entry(gpuobj, &dev_priv->gpuobj_list, list) {
-		if (!gpuobj->im_backing || (gpuobj->flags & NVOBJ_FLAG_FAKE))
+		if (!gpuobj->im_backing)
 			continue;
 
-		gpuobj->im_backing_suspend = vmalloc(gpuobj->im_pramin->size);
+		gpuobj->im_backing_suspend = vmalloc(gpuobj->size);
 		if (!gpuobj->im_backing_suspend) {
 			nouveau_gpuobj_resume(dev);
 			return -ENOMEM;
 		}
 
-		dev_priv->engine.instmem.prepare_access(dev, false);
-		for (i = 0; i < gpuobj->im_pramin->size / 4; i++)
-			gpuobj->im_backing_suspend[i] = nv_ro32(dev, gpuobj, i);
-		dev_priv->engine.instmem.finish_access(dev);
+		for (i = 0; i < gpuobj->size; i += 4)
+			gpuobj->im_backing_suspend[i/4] = nv_ro32(gpuobj, i);
 	}
 
 	return 0;
@@ -1212,10 +857,9 @@ nouveau_gpuobj_resume(struct drm_device *dev)
 		if (!gpuobj->im_backing_suspend)
 			continue;
 
-		dev_priv->engine.instmem.prepare_access(dev, true);
-		for (i = 0; i < gpuobj->im_pramin->size / 4; i++)
-			nv_wo32(dev, gpuobj, i, gpuobj->im_backing_suspend[i]);
-		dev_priv->engine.instmem.finish_access(dev);
+		for (i = 0; i < gpuobj->size; i += 4)
+			nv_wo32(gpuobj, i, gpuobj->im_backing_suspend[i/4]);
+		dev_priv->engine.instmem.flush(dev);
 	}
 
 	nouveau_gpuobj_suspend_cleanup(dev);
@@ -1232,7 +876,6 @@ int nouveau_ioctl_grobj_alloc(struct drm_device *dev, void *data,
 	struct nouveau_channel *chan;
 	int ret;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
 	NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(init->channel, file_priv, chan);
 
 	if (init->handle == ~0)
@@ -1250,25 +893,24 @@ int nouveau_ioctl_grobj_alloc(struct drm_device *dev, void *data,
 		return -EPERM;
 	}
 
-	if (nouveau_gpuobj_ref_find(chan, init->handle, NULL) == 0)
+	if (nouveau_ramht_find(chan, init->handle))
 		return -EEXIST;
 
 	if (!grc->software)
 		ret = nouveau_gpuobj_gr_new(chan, grc->id, &gr);
 	else
 		ret = nouveau_gpuobj_sw_new(chan, grc->id, &gr);
-
 	if (ret) {
 		NV_ERROR(dev, "Error creating object: %d (%d/0x%08x)\n",
 			 ret, init->channel, init->handle);
 		return ret;
 	}
 
-	ret = nouveau_gpuobj_ref_add(dev, chan, init->handle, gr, NULL);
+	ret = nouveau_ramht_insert(chan, init->handle, gr);
+	nouveau_gpuobj_ref(NULL, &gr);
 	if (ret) {
 		NV_ERROR(dev, "Error referencing object: %d (%d/0x%08x)\n",
 			 ret, init->channel, init->handle);
-		nouveau_gpuobj_del(dev, &gr);
 		return ret;
 	}
 
@@ -1279,17 +921,62 @@ int nouveau_ioctl_gpuobj_free(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv)
 {
 	struct drm_nouveau_gpuobj_free *objfree = data;
-	struct nouveau_gpuobj_ref *ref;
+	struct nouveau_gpuobj *gpuobj;
 	struct nouveau_channel *chan;
-	int ret;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
 	NOUVEAU_GET_USER_CHANNEL_WITH_RETURN(objfree->channel, file_priv, chan);
 
-	ret = nouveau_gpuobj_ref_find(chan, objfree->handle, &ref);
-	if (ret)
-		return ret;
-	nouveau_gpuobj_ref_del(dev, &ref);
+	gpuobj = nouveau_ramht_find(chan, objfree->handle);
+	if (!gpuobj)
+		return -ENOENT;
 
+	nouveau_ramht_remove(chan, objfree->handle);
 	return 0;
 }
+
+u32
+nv_ro32(struct nouveau_gpuobj *gpuobj, u32 offset)
+{
+	struct drm_nouveau_private *dev_priv = gpuobj->dev->dev_private;
+	struct drm_device *dev = gpuobj->dev;
+
+	if (gpuobj->pinst == ~0 || !dev_priv->ramin_available) {
+		u64  ptr = gpuobj->vinst + offset;
+		u32 base = ptr >> 16;
+		u32  val;
+
+		spin_lock(&dev_priv->ramin_lock);
+		if (dev_priv->ramin_base != base) {
+			dev_priv->ramin_base = base;
+			nv_wr32(dev, 0x001700, dev_priv->ramin_base);
+		}
+		val = nv_rd32(dev, 0x700000 + (ptr & 0xffff));
+		spin_unlock(&dev_priv->ramin_lock);
+		return val;
+	}
+
+	return nv_ri32(dev, gpuobj->pinst + offset);
+}
+
+void
+nv_wo32(struct nouveau_gpuobj *gpuobj, u32 offset, u32 val)
+{
+	struct drm_nouveau_private *dev_priv = gpuobj->dev->dev_private;
+	struct drm_device *dev = gpuobj->dev;
+
+	if (gpuobj->pinst == ~0 || !dev_priv->ramin_available) {
+		u64  ptr = gpuobj->vinst + offset;
+		u32 base = ptr >> 16;
+
+		spin_lock(&dev_priv->ramin_lock);
+		if (dev_priv->ramin_base != base) {
+			dev_priv->ramin_base = base;
+			nv_wr32(dev, 0x001700, dev_priv->ramin_base);
+		}
+		nv_wr32(dev, 0x700000 + (ptr & 0xffff), val);
+		spin_unlock(&dev_priv->ramin_lock);
+		return;
+	}
+
+	nv_wi32(dev, gpuobj->pinst + offset, val);
+}
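Note on the nv_ro32()/nv_wo32() slow path above: when an object has no
PRAMIN mapping, the 64-bit instance address is split into a 64KiB window
base (written to register 0x001700) and a 16-bit offset into the aperture
at 0x700000. The following is a minimal userland sketch of that address
split only. The names ramin_window, ramin_split and ramin_mmio_addr are
hypothetical and are not part of the driver; locking and the actual MMIO
accesses are left out.

```c
#include <stdint.h>

/* Hypothetical helper type: result of splitting an instance address. */
struct ramin_window {
	uint32_t base;   /* value the patch writes to register 0x001700 */
	uint32_t offset; /* byte offset inside the 64KiB window */
};

/* Mirrors "base = ptr >> 16" and "ptr & 0xffff" from the slow path. */
static struct ramin_window ramin_split(uint64_t ptr)
{
	struct ramin_window w;

	w.base = (uint32_t)(ptr >> 16);
	w.offset = (uint32_t)(ptr & 0xffff);
	return w;
}

/* MMIO address the slow path reads or writes once the window is set. */
static uint32_t ramin_mmio_addr(uint64_t ptr)
{
	return 0x700000u + ramin_split(ptr).offset;
}
```

For example, an instance address of 0x12345678 selects window base 0x1234
and is then accessed at MMIO address 0x705678.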
diff --git a/drivers/gpu/drm/nouveau/nouveau_ramht.c b/drivers/gpu/drm/nouveau/nouveau_ramht.c
new file mode 100644
index 0000000..7f16697
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_ramht.c
@@ -0,0 +1,289 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#include "drmP.h"
+
+#include "nouveau_drv.h"
+#include "nouveau_ramht.h"
+
+static u32
+nouveau_ramht_hash_handle(struct nouveau_channel *chan, u32 handle)
+{
+	struct drm_device *dev = chan->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_ramht *ramht = chan->ramht;
+	u32 hash = 0;
+	int i;
+
+	NV_DEBUG(dev, "ch%d handle=0x%08x\n", chan->id, handle);
+
+	for (i = 32; i > 0; i -= ramht->bits) {
+		hash ^= (handle & ((1 << ramht->bits) - 1));
+		handle >>= ramht->bits;
+	}
+
+	if (dev_priv->card_type < NV_50)
+		hash ^= chan->id << (ramht->bits - 4);
+	hash <<= 3;
+
+	NV_DEBUG(dev, "hash=0x%08x\n", hash);
+	return hash;
+}
+
+static int
+nouveau_ramht_entry_valid(struct drm_device *dev, struct nouveau_gpuobj *ramht,
+			  u32 offset)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 ctx = nv_ro32(ramht, offset + 4);
+
+	if (dev_priv->card_type < NV_40)
+		return ((ctx & NV_RAMHT_CONTEXT_VALID) != 0);
+	return (ctx != 0);
+}
+
+static int
+nouveau_ramht_entry_same_channel(struct nouveau_channel *chan,
+				 struct nouveau_gpuobj *ramht, u32 offset)
+{
+	struct drm_nouveau_private *dev_priv = chan->dev->dev_private;
+	u32 ctx = nv_ro32(ramht, offset + 4);
+
+	if (dev_priv->card_type >= NV_50)
+		return true;
+	else if (dev_priv->card_type >= NV_40)
+		return chan->id ==
+			((ctx >> NV40_RAMHT_CONTEXT_CHANNEL_SHIFT) & 0x1f);
+	else
+		return chan->id ==
+			((ctx >> NV_RAMHT_CONTEXT_CHANNEL_SHIFT) & 0x1f);
+}
+
+int
+nouveau_ramht_insert(struct nouveau_channel *chan, u32 handle,
+		     struct nouveau_gpuobj *gpuobj)
+{
+	struct drm_device *dev = chan->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_instmem_engine *instmem = &dev_priv->engine.instmem;
+	struct nouveau_ramht_entry *entry;
+	struct nouveau_gpuobj *ramht = chan->ramht->gpuobj;
+	unsigned long flags;
+	u32 ctx, co, ho;
+
+	if (nouveau_ramht_find(chan, handle))
+		return -EEXIST;
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return -ENOMEM;
+	entry->channel = chan;
+	entry->gpuobj = NULL;
+	entry->handle = handle;
+	nouveau_gpuobj_ref(gpuobj, &entry->gpuobj);
+
+	if (dev_priv->card_type < NV_40) {
+		ctx = NV_RAMHT_CONTEXT_VALID | (gpuobj->cinst >> 4) |
+		      (chan->id << NV_RAMHT_CONTEXT_CHANNEL_SHIFT) |
+		      (gpuobj->engine << NV_RAMHT_CONTEXT_ENGINE_SHIFT);
+	} else
+	if (dev_priv->card_type < NV_50) {
+		ctx = (gpuobj->cinst >> 4) |
+		      (chan->id << NV40_RAMHT_CONTEXT_CHANNEL_SHIFT) |
+		      (gpuobj->engine << NV40_RAMHT_CONTEXT_ENGINE_SHIFT);
+	} else {
+		if (gpuobj->engine == NVOBJ_ENGINE_DISPLAY) {
+			ctx = (gpuobj->cinst << 10) | 2;
+		} else {
+			ctx = (gpuobj->cinst >> 4) |
+			      ((gpuobj->engine <<
+				NV40_RAMHT_CONTEXT_ENGINE_SHIFT));
+		}
+	}
+
+	spin_lock_irqsave(&chan->ramht->lock, flags);
+	list_add(&entry->head, &chan->ramht->entries);
+
+	co = ho = nouveau_ramht_hash_handle(chan, handle);
+	do {
+		if (!nouveau_ramht_entry_valid(dev, ramht, co)) {
+			NV_DEBUG(dev,
+				 "insert ch%d 0x%08x: h=0x%08x, c=0x%08x\n",
+				 chan->id, co, handle, ctx);
+			nv_wo32(ramht, co + 0, handle);
+			nv_wo32(ramht, co + 4, ctx);
+
+			spin_unlock_irqrestore(&chan->ramht->lock, flags);
+			instmem->flush(dev);
+			return 0;
+		}
+		NV_DEBUG(dev, "collision ch%d 0x%08x: h=0x%08x\n",
+			 chan->id, co, nv_ro32(ramht, co));
+
+		co += 8;
+		if (co >= ramht->size)
+			co = 0;
+	} while (co != ho);
+
+	NV_ERROR(dev, "RAMHT space exhausted. ch=%d\n", chan->id);
+	list_del(&entry->head);
+	spin_unlock_irqrestore(&chan->ramht->lock, flags);
+	kfree(entry);
+	return -ENOMEM;
+}
+
+static void
+nouveau_ramht_remove_locked(struct nouveau_channel *chan, u32 handle)
+{
+	struct drm_device *dev = chan->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_instmem_engine *instmem = &dev_priv->engine.instmem;
+	struct nouveau_gpuobj *ramht = chan->ramht->gpuobj;
+	struct nouveau_ramht_entry *entry, *tmp;
+	u32 co, ho;
+
+	list_for_each_entry_safe(entry, tmp, &chan->ramht->entries, head) {
+		if (entry->channel != chan || entry->handle != handle)
+			continue;
+
+		nouveau_gpuobj_ref(NULL, &entry->gpuobj);
+		list_del(&entry->head);
+		kfree(entry);
+		break;
+	}
+
+	co = ho = nouveau_ramht_hash_handle(chan, handle);
+	do {
+		if (nouveau_ramht_entry_valid(dev, ramht, co) &&
+		    nouveau_ramht_entry_same_channel(chan, ramht, co) &&
+		    (handle == nv_ro32(ramht, co))) {
+			NV_DEBUG(dev,
+				 "remove ch%d 0x%08x: h=0x%08x, c=0x%08x\n",
+				 chan->id, co, handle, nv_ro32(ramht, co + 4));
+			nv_wo32(ramht, co + 0, 0x00000000);
+			nv_wo32(ramht, co + 4, 0x00000000);
+			instmem->flush(dev);
+			return;
+		}
+
+		co += 8;
+		if (co >= ramht->size)
+			co = 0;
+	} while (co != ho);
+
+	NV_ERROR(dev, "RAMHT entry not found. ch=%d, handle=0x%08x\n",
+		 chan->id, handle);
+}
+
+void
+nouveau_ramht_remove(struct nouveau_channel *chan, u32 handle)
+{
+	struct nouveau_ramht *ramht = chan->ramht;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ramht->lock, flags);
+	nouveau_ramht_remove_locked(chan, handle);
+	spin_unlock_irqrestore(&ramht->lock, flags);
+}
+
+struct nouveau_gpuobj *
+nouveau_ramht_find(struct nouveau_channel *chan, u32 handle)
+{
+	struct nouveau_ramht *ramht = chan->ramht;
+	struct nouveau_ramht_entry *entry;
+	struct nouveau_gpuobj *gpuobj = NULL;
+	unsigned long flags;
+
+	if (unlikely(!chan->ramht))
+		return NULL;
+
+	spin_lock_irqsave(&ramht->lock, flags);
+	list_for_each_entry(entry, &chan->ramht->entries, head) {
+		if (entry->channel == chan && entry->handle == handle) {
+			gpuobj = entry->gpuobj;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&ramht->lock, flags);
+
+	return gpuobj;
+}
+
+int
+nouveau_ramht_new(struct drm_device *dev, struct nouveau_gpuobj *gpuobj,
+		  struct nouveau_ramht **pramht)
+{
+	struct nouveau_ramht *ramht;
+
+	ramht = kzalloc(sizeof(*ramht), GFP_KERNEL);
+	if (!ramht)
+		return -ENOMEM;
+
+	ramht->dev = dev;
+	kref_init(&ramht->refcount);
+	ramht->bits = drm_order(gpuobj->size / 8);
+	INIT_LIST_HEAD(&ramht->entries);
+	spin_lock_init(&ramht->lock);
+	nouveau_gpuobj_ref(gpuobj, &ramht->gpuobj);
+
+	*pramht = ramht;
+	return 0;
+}
+
+static void
+nouveau_ramht_del(struct kref *ref)
+{
+	struct nouveau_ramht *ramht =
+		container_of(ref, struct nouveau_ramht, refcount);
+
+	nouveau_gpuobj_ref(NULL, &ramht->gpuobj);
+	kfree(ramht);
+}
+
+void
+nouveau_ramht_ref(struct nouveau_ramht *ref, struct nouveau_ramht **ptr,
+		  struct nouveau_channel *chan)
+{
+	struct nouveau_ramht_entry *entry, *tmp;
+	struct nouveau_ramht *ramht;
+	unsigned long flags;
+
+	if (ref)
+		kref_get(&ref->refcount);
+
+	ramht = *ptr;
+	if (ramht) {
+		spin_lock_irqsave(&ramht->lock, flags);
+		list_for_each_entry_safe(entry, tmp, &ramht->entries, head) {
+			if (entry->channel != chan)
+				continue;
+
+			nouveau_ramht_remove_locked(chan, entry->handle);
+		}
+		spin_unlock_irqrestore(&ramht->lock, flags);
+
+		kref_put(&ramht->refcount, nouveau_ramht_del);
+	}
+	*ptr = ref;
+}
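Note on nouveau_ramht_hash_handle() above: the handle is folded by XORing
successive "bits"-wide chunks, the channel id is mixed in on pre-NV50
cards, and the result is scaled by 8 because each RAMHT entry is two
32-bit words. Insert and remove then probe linearly in 8-byte steps,
wrapping at the end of the table. The following is a minimal userland
sketch of that arithmetic, assuming bits is in the range 4..31. The names
ramht_hash and ramht_next are hypothetical, and the pre_nv50 flag stands
in for the driver's card_type check.

```c
#include <stdint.h>

/* Hypothetical mirror of the hash; returns a byte offset into RAMHT. */
static uint32_t ramht_hash(uint32_t handle, int bits, int chan_id,
			   int pre_nv50)
{
	uint32_t hash = 0;
	int i;

	/* Fold the 32-bit handle down to "bits" bits. */
	for (i = 32; i > 0; i -= bits) {
		hash ^= handle & ((1u << bits) - 1);
		handle >>= bits;
	}

	/* Pre-NV50 cards share one table, so mix in the channel id. */
	if (pre_nv50)
		hash ^= (uint32_t)chan_id << (bits - 4);

	/* Each entry is 8 bytes (handle word + context word). */
	return hash << 3;
}

/* Linear probe step used by insert/remove: next entry, wrapping to 0. */
static uint32_t ramht_next(uint32_t co, uint32_t size)
{
	co += 8;
	if (co >= size)
		co = 0;
	return co;
}
```

With the 0x8000-byte per-channel table allocated on NV50 (4096 entries,
so bits = 12), handle 0x12345678 hashes to byte offset 0x2978. On a
pre-NV50 card with the same table size, channel 1 would land at 0x2178
instead.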
diff --git a/drivers/gpu/drm/nouveau/nouveau_ramht.h b/drivers/gpu/drm/nouveau/nouveau_ramht.h
new file mode 100644
index 0000000..b79cb5e
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_ramht.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#ifndef __NOUVEAU_RAMHT_H__
+#define __NOUVEAU_RAMHT_H__
+
+struct nouveau_ramht_entry {
+	struct list_head head;
+	struct nouveau_channel *channel;
+	struct nouveau_gpuobj *gpuobj;
+	u32 handle;
+};
+
+struct nouveau_ramht {
+	struct drm_device *dev;
+	struct kref refcount;
+	spinlock_t lock;
+	struct nouveau_gpuobj *gpuobj;
+	struct list_head entries;
+	int bits;
+};
+
+extern int  nouveau_ramht_new(struct drm_device *, struct nouveau_gpuobj *,
+			      struct nouveau_ramht **);
+extern void nouveau_ramht_ref(struct nouveau_ramht *, struct nouveau_ramht **,
+			      struct nouveau_channel *unref_channel);
+
+extern int  nouveau_ramht_insert(struct nouveau_channel *, u32 handle,
+				 struct nouveau_gpuobj *);
+extern void nouveau_ramht_remove(struct nouveau_channel *, u32 handle);
+extern struct nouveau_gpuobj *
+nouveau_ramht_find(struct nouveau_channel *chan, u32 handle);
+
+#endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_reg.h b/drivers/gpu/drm/nouveau/nouveau_reg.h
index 6ca80a3..1b42541 100644
--- a/drivers/gpu/drm/nouveau/nouveau_reg.h
+++ b/drivers/gpu/drm/nouveau/nouveau_reg.h
@@ -1,19 +1,64 @@
 
+#define NV04_PFB_BOOT_0						0x00100000
+#	define NV04_PFB_BOOT_0_RAM_AMOUNT			0x00000003
+#	define NV04_PFB_BOOT_0_RAM_AMOUNT_32MB			0x00000000
+#	define NV04_PFB_BOOT_0_RAM_AMOUNT_4MB			0x00000001
+#	define NV04_PFB_BOOT_0_RAM_AMOUNT_8MB			0x00000002
+#	define NV04_PFB_BOOT_0_RAM_AMOUNT_16MB			0x00000003
+#	define NV04_PFB_BOOT_0_RAM_WIDTH_128			0x00000004
+#	define NV04_PFB_BOOT_0_RAM_TYPE				0x00000028
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SGRAM_8MBIT		0x00000000
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SGRAM_16MBIT		0x00000008
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SGRAM_16MBIT_4BANK	0x00000010
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SDRAM_16MBIT		0x00000018
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SDRAM_64MBIT		0x00000020
+#	define NV04_PFB_BOOT_0_RAM_TYPE_SDRAM_64MBITX16		0x00000028
+#	define NV04_PFB_BOOT_0_UMA_ENABLE			0x00000100
+#	define NV04_PFB_BOOT_0_UMA_SIZE				0x0000f000
+#define NV04_PFB_DEBUG_0					0x00100080
+#	define NV04_PFB_DEBUG_0_PAGE_MODE			0x00000001
+#	define NV04_PFB_DEBUG_0_REFRESH_OFF			0x00000010
+#	define NV04_PFB_DEBUG_0_REFRESH_COUNTX64		0x00003f00
+#	define NV04_PFB_DEBUG_0_REFRESH_SLOW_CLK		0x00004000
+#	define NV04_PFB_DEBUG_0_SAFE_MODE			0x00008000
+#	define NV04_PFB_DEBUG_0_ALOM_ENABLE			0x00010000
+#	define NV04_PFB_DEBUG_0_CASOE				0x00100000
+#	define NV04_PFB_DEBUG_0_CKE_INVERT			0x10000000
+#	define NV04_PFB_DEBUG_0_REFINC				0x20000000
+#	define NV04_PFB_DEBUG_0_SAVE_POWER_OFF			0x40000000
+#define NV04_PFB_CFG0						0x00100200
+#	define NV04_PFB_CFG0_SCRAMBLE				0x20000000
+#define NV04_PFB_CFG1						0x00100204
+#define NV04_PFB_FIFO_DATA					0x0010020c
+#	define NV10_PFB_FIFO_DATA_RAM_AMOUNT_MB_MASK		0xfff00000
+#	define NV10_PFB_FIFO_DATA_RAM_AMOUNT_MB_SHIFT		20
+#define NV10_PFB_REFCTRL					0x00100210
+#	define NV10_PFB_REFCTRL_VALID_1				(1 << 31)
+#define NV04_PFB_PAD						0x0010021c
+#	define NV04_PFB_PAD_CKE_NORMAL				(1 << 0)
+#define NV10_PFB_TILE(i)                              (0x00100240 + (i*16))
+#define NV10_PFB_TILE__SIZE					8
+#define NV10_PFB_TLIMIT(i)                            (0x00100244 + (i*16))
+#define NV10_PFB_TSIZE(i)                             (0x00100248 + (i*16))
+#define NV10_PFB_TSTATUS(i)                           (0x0010024c + (i*16))
+#define NV04_PFB_REF						0x001002d0
+#	define NV04_PFB_REF_CMD_REFRESH				(1 << 0)
+#define NV04_PFB_PRE						0x001002d4
+#	define NV04_PFB_PRE_CMD_PRECHARGE			(1 << 0)
+#define NV10_PFB_CLOSE_PAGE2					0x0010033c
+#define NV04_PFB_SCRAMBLE(i)                         (0x00100400 + 4 * (i))
+#define NV40_PFB_TILE(i)                              (0x00100600 + (i*16))
+#define NV40_PFB_TILE__SIZE_0					12
+#define NV40_PFB_TILE__SIZE_1					15
+#define NV40_PFB_TLIMIT(i)                            (0x00100604 + (i*16))
+#define NV40_PFB_TSIZE(i)                             (0x00100608 + (i*16))
+#define NV40_PFB_TSTATUS(i)                           (0x0010060c + (i*16))
+#define NV40_PFB_UNK_800					0x00100800
 
-#define NV03_BOOT_0                                        0x00100000
-#    define NV03_BOOT_0_RAM_AMOUNT                         0x00000003
-#    define NV03_BOOT_0_RAM_AMOUNT_8MB                     0x00000000
-#    define NV03_BOOT_0_RAM_AMOUNT_2MB                     0x00000001
-#    define NV03_BOOT_0_RAM_AMOUNT_4MB                     0x00000002
-#    define NV03_BOOT_0_RAM_AMOUNT_8MB_SDRAM               0x00000003
-#    define NV04_BOOT_0_RAM_AMOUNT_32MB                    0x00000000
-#    define NV04_BOOT_0_RAM_AMOUNT_4MB                     0x00000001
-#    define NV04_BOOT_0_RAM_AMOUNT_8MB                     0x00000002
-#    define NV04_BOOT_0_RAM_AMOUNT_16MB                    0x00000003
-
-#define NV04_FIFO_DATA                                     0x0010020c
-#    define NV10_FIFO_DATA_RAM_AMOUNT_MB_MASK              0xfff00000
-#    define NV10_FIFO_DATA_RAM_AMOUNT_MB_SHIFT             20
+#define NV_PEXTDEV_BOOT_0					0x00101000
+#define NV_PEXTDEV_BOOT_0_RAMCFG				0x0000003c
+#	define NV_PEXTDEV_BOOT_0_STRAP_FP_IFACE_12BIT		(8 << 12)
+#define NV_PEXTDEV_BOOT_3					0x0010100c
 
 #define NV_RAMIN                                           0x00700000
 
@@ -131,23 +176,6 @@
 #define NV04_PTIMER_TIME_1                                 0x00009410
 #define NV04_PTIMER_ALARM_0                                0x00009420
 
-#define NV04_PFB_CFG0                                      0x00100200
-#define NV04_PFB_CFG1                                      0x00100204
-#define NV40_PFB_020C                                      0x0010020C
-#define NV10_PFB_TILE(i)                                   (0x00100240 + (i*16))
-#define NV10_PFB_TILE__SIZE                                8
-#define NV10_PFB_TLIMIT(i)                                 (0x00100244 + (i*16))
-#define NV10_PFB_TSIZE(i)                                  (0x00100248 + (i*16))
-#define NV10_PFB_TSTATUS(i)                                (0x0010024C + (i*16))
-#define NV10_PFB_CLOSE_PAGE2                               0x0010033C
-#define NV40_PFB_TILE(i)                                   (0x00100600 + (i*16))
-#define NV40_PFB_TILE__SIZE_0                              12
-#define NV40_PFB_TILE__SIZE_1                              15
-#define NV40_PFB_TLIMIT(i)                                 (0x00100604 + (i*16))
-#define NV40_PFB_TSIZE(i)                                  (0x00100608 + (i*16))
-#define NV40_PFB_TSTATUS(i)                                (0x0010060C + (i*16))
-#define NV40_PFB_UNK_800					0x00100800
-
 #define NV04_PGRAPH_DEBUG_0                                0x00400080
 #define NV04_PGRAPH_DEBUG_1                                0x00400084
 #define NV04_PGRAPH_DEBUG_2                                0x00400088
@@ -192,28 +220,21 @@
 #    define NV_PGRAPH_INTR_ERROR                              (1<<20)
 #define NV10_PGRAPH_CTX_CONTROL                            0x00400144
 #define NV10_PGRAPH_CTX_USER                               0x00400148
-#define NV10_PGRAPH_CTX_SWITCH1                            0x0040014C
-#define NV10_PGRAPH_CTX_SWITCH2                            0x00400150
-#define NV10_PGRAPH_CTX_SWITCH3                            0x00400154
-#define NV10_PGRAPH_CTX_SWITCH4                            0x00400158
-#define NV10_PGRAPH_CTX_SWITCH5                            0x0040015C
+#define NV10_PGRAPH_CTX_SWITCH(i)                         (0x0040014C + 0x4*(i))
 #define NV04_PGRAPH_CTX_SWITCH1                            0x00400160
-#define NV10_PGRAPH_CTX_CACHE1                             0x00400160
+#define NV10_PGRAPH_CTX_CACHE(i, j)                       (0x00400160	\
+							   + 0x4*(i) + 0x20*(j))
 #define NV04_PGRAPH_CTX_SWITCH2                            0x00400164
 #define NV04_PGRAPH_CTX_SWITCH3                            0x00400168
 #define NV04_PGRAPH_CTX_SWITCH4                            0x0040016C
 #define NV04_PGRAPH_CTX_CONTROL                            0x00400170
 #define NV04_PGRAPH_CTX_USER                               0x00400174
 #define NV04_PGRAPH_CTX_CACHE1                             0x00400180
-#define NV10_PGRAPH_CTX_CACHE2                             0x00400180
 #define NV03_PGRAPH_CTX_CONTROL                            0x00400190
 #define NV03_PGRAPH_CTX_USER                               0x00400194
 #define NV04_PGRAPH_CTX_CACHE2                             0x004001A0
-#define NV10_PGRAPH_CTX_CACHE3                             0x004001A0
 #define NV04_PGRAPH_CTX_CACHE3                             0x004001C0
-#define NV10_PGRAPH_CTX_CACHE4                             0x004001C0
 #define NV04_PGRAPH_CTX_CACHE4                             0x004001E0
-#define NV10_PGRAPH_CTX_CACHE5                             0x004001E0
 #define NV40_PGRAPH_CTXCTL_0304                            0x00400304
 #define NV40_PGRAPH_CTXCTL_0304_XFER_CTX                   0x00000001
 #define NV40_PGRAPH_CTXCTL_UCODE_STAT                      0x00400308
@@ -328,9 +349,12 @@
 #define NV04_PGRAPH_FFINTFC_ST2                            0x00400754
 #define NV10_PGRAPH_RDI_DATA                               0x00400754
 #define NV04_PGRAPH_DMA_PITCH                              0x00400760
-#define NV10_PGRAPH_FFINTFC_ST2                            0x00400764
+#define NV10_PGRAPH_FFINTFC_FIFO_PTR                       0x00400760
 #define NV04_PGRAPH_DVD_COLORFMT                           0x00400764
+#define NV10_PGRAPH_FFINTFC_ST2                            0x00400764
 #define NV04_PGRAPH_SCALED_FORMAT                          0x00400768
+#define NV10_PGRAPH_FFINTFC_ST2_DL                         0x00400768
+#define NV10_PGRAPH_FFINTFC_ST2_DH                         0x0040076c
 #define NV10_PGRAPH_DMA_PITCH                              0x00400770
 #define NV10_PGRAPH_DVD_COLORFMT                           0x00400774
 #define NV10_PGRAPH_SCALED_FORMAT                          0x00400778
@@ -527,6 +551,8 @@
 #define NV10_PFIFO_CACHE1_DMA_SUBROUTINE                   0x0000324C
 #define NV03_PFIFO_CACHE1_PULL0                            0x00003240
 #define NV04_PFIFO_CACHE1_PULL0                            0x00003250
+#    define NV04_PFIFO_CACHE1_PULL0_HASH_FAILED            0x00000010
+#    define NV04_PFIFO_CACHE1_PULL0_HASH_BUSY              0x00001000
 #define NV03_PFIFO_CACHE1_PULL1                            0x00003250
 #define NV04_PFIFO_CACHE1_PULL1                            0x00003254
 #define NV04_PFIFO_CACHE1_HASH                             0x00003258
@@ -761,15 +787,12 @@
 #define NV50_PDISPLAY_DAC_MODE_CTRL_C(i)                (0x00610b5c + (i) * 0x8)
 #define NV50_PDISPLAY_SOR_MODE_CTRL_P(i)                (0x00610b70 + (i) * 0x8)
 #define NV50_PDISPLAY_SOR_MODE_CTRL_C(i)                (0x00610b74 + (i) * 0x8)
+#define NV50_PDISPLAY_EXT_MODE_CTRL_P(i)                (0x00610b80 + (i) * 0x8)
+#define NV50_PDISPLAY_EXT_MODE_CTRL_C(i)                (0x00610b84 + (i) * 0x8)
 #define NV50_PDISPLAY_DAC_MODE_CTRL2_P(i)               (0x00610bdc + (i) * 0x8)
 #define NV50_PDISPLAY_DAC_MODE_CTRL2_C(i)               (0x00610be0 + (i) * 0x8)
-
 #define NV90_PDISPLAY_SOR_MODE_CTRL_P(i)                (0x00610794 + (i) * 0x8)
 #define NV90_PDISPLAY_SOR_MODE_CTRL_C(i)                (0x00610798 + (i) * 0x8)
-#define NV90_PDISPLAY_DAC_MODE_CTRL_P(i)                (0x00610b58 + (i) * 0x8)
-#define NV90_PDISPLAY_DAC_MODE_CTRL_C(i)                (0x00610b5c + (i) * 0x8)
-#define NV90_PDISPLAY_DAC_MODE_CTRL2_P(i)               (0x00610b80 + (i) * 0x8)
-#define NV90_PDISPLAY_DAC_MODE_CTRL2_C(i)               (0x00610b84 + (i) * 0x8)
 
 #define NV50_PDISPLAY_CRTC_CLK                                       0x00614000
 #define NV50_PDISPLAY_CRTC_CLK_CTRL1(i)                 ((i) * 0x800 + 0x614100)
@@ -814,6 +837,7 @@
 #define NV50_PDISPLAY_SOR_BACKLIGHT_ENABLE                           0x80000000
 #define NV50_PDISPLAY_SOR_BACKLIGHT_LEVEL                            0x00000fff
 #define NV50_SOR_DP_CTRL(i,l)            (0x0061c10c + (i) * 0x800 + (l) * 0x80)
+#define NV50_SOR_DP_CTRL_ENABLED                                     0x00000001
 #define NV50_SOR_DP_CTRL_ENHANCED_FRAME_ENABLED                      0x00004000
 #define NV50_SOR_DP_CTRL_LANE_MASK                                   0x001f0000
 #define NV50_SOR_DP_CTRL_LANE_0_ENABLED                              0x00010000
diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
index 1d6ee8b..7f028fe 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
@@ -97,7 +97,6 @@ nouveau_sgdma_bind(struct ttm_backend *be, struct ttm_mem_reg *mem)
 
 	NV_DEBUG(dev, "pg=0x%lx\n", mem->mm_node->start);
 
-	dev_priv->engine.instmem.prepare_access(nvbe->dev, true);
 	pte = nouveau_sgdma_pte(nvbe->dev, mem->mm_node->start << PAGE_SHIFT);
 	nvbe->pte_start = pte;
 	for (i = 0; i < nvbe->nr_pages; i++) {
@@ -106,34 +105,23 @@ nouveau_sgdma_bind(struct ttm_backend *be, struct ttm_mem_reg *mem)
 		uint32_t offset_h = upper_32_bits(dma_offset);
 
 		for (j = 0; j < PAGE_SIZE / NV_CTXDMA_PAGE_SIZE; j++) {
-			if (dev_priv->card_type < NV_50)
-				nv_wo32(dev, gpuobj, pte++, offset_l | 3);
-			else {
-				nv_wo32(dev, gpuobj, pte++, offset_l | 0x21);
-				nv_wo32(dev, gpuobj, pte++, offset_h & 0xff);
+			if (dev_priv->card_type < NV_50) {
+				nv_wo32(gpuobj, (pte * 4) + 0, offset_l | 3);
+				pte += 1;
+			} else {
+				nv_wo32(gpuobj, (pte * 4) + 0, offset_l | 0x21);
+				nv_wo32(gpuobj, (pte * 4) + 4, offset_h & 0xff);
+				pte += 2;
 			}
 
 			dma_offset += NV_CTXDMA_PAGE_SIZE;
 		}
 	}
-	dev_priv->engine.instmem.finish_access(nvbe->dev);
+	dev_priv->engine.instmem.flush(nvbe->dev);
 
 	if (dev_priv->card_type == NV_50) {
-		nv_wr32(dev, 0x100c80, 0x00050001);
-		if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-			NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-			NV_ERROR(dev, "0x100c80 = 0x%08x\n",
-						nv_rd32(dev, 0x100c80));
-			return -EBUSY;
-		}
-
-		nv_wr32(dev, 0x100c80, 0x00000001);
-		if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-			NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-			NV_ERROR(dev, "0x100c80 = 0x%08x\n",
-						nv_rd32(dev, 0x100c80));
-			return -EBUSY;
-		}
+		nv50_vm_flush(dev, 5); /* PGRAPH */
+		nv50_vm_flush(dev, 0); /* PFIFO */
 	}
 
 	nvbe->bound = true;
@@ -154,40 +142,28 @@ nouveau_sgdma_unbind(struct ttm_backend *be)
 	if (!nvbe->bound)
 		return 0;
 
-	dev_priv->engine.instmem.prepare_access(nvbe->dev, true);
 	pte = nvbe->pte_start;
 	for (i = 0; i < nvbe->nr_pages; i++) {
 		dma_addr_t dma_offset = dev_priv->gart_info.sg_dummy_bus;
 
 		for (j = 0; j < PAGE_SIZE / NV_CTXDMA_PAGE_SIZE; j++) {
-			if (dev_priv->card_type < NV_50)
-				nv_wo32(dev, gpuobj, pte++, dma_offset | 3);
-			else {
-				nv_wo32(dev, gpuobj, pte++, dma_offset | 0x21);
-				nv_wo32(dev, gpuobj, pte++, 0x00000000);
+			if (dev_priv->card_type < NV_50) {
+				nv_wo32(gpuobj, (pte * 4) + 0, dma_offset | 3);
+				pte += 1;
+			} else {
+				nv_wo32(gpuobj, (pte * 4) + 0, 0x00000000);
+				nv_wo32(gpuobj, (pte * 4) + 4, 0x00000000);
+				pte += 2;
 			}
 
 			dma_offset += NV_CTXDMA_PAGE_SIZE;
 		}
 	}
-	dev_priv->engine.instmem.finish_access(nvbe->dev);
+	dev_priv->engine.instmem.flush(nvbe->dev);
 
 	if (dev_priv->card_type == NV_50) {
-		nv_wr32(dev, 0x100c80, 0x00050001);
-		if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-			NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-			NV_ERROR(dev, "0x100c80 = 0x%08x\n",
-						nv_rd32(dev, 0x100c80));
-			return -EBUSY;
-		}
-
-		nv_wr32(dev, 0x100c80, 0x00000001);
-		if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-			NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-			NV_ERROR(dev, "0x100c80 = 0x%08x\n",
-						nv_rd32(dev, 0x100c80));
-			return -EBUSY;
-		}
+		nv50_vm_flush(dev, 5);
+		nv50_vm_flush(dev, 0);
 	}
 
 	nvbe->bound = false;
@@ -242,6 +218,7 @@ int
 nouveau_sgdma_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_gpuobj *gpuobj = NULL;
 	uint32_t aper_size, obj_size;
 	int i, ret;
@@ -257,7 +234,6 @@ nouveau_sgdma_init(struct drm_device *dev)
 	}
 
 	ret = nouveau_gpuobj_new(dev, NULL, obj_size, 16,
-				      NVOBJ_FLAG_ALLOW_NO_REFS |
 				      NVOBJ_FLAG_ZERO_ALLOC |
 				      NVOBJ_FLAG_ZERO_FREE, &gpuobj);
 	if (ret) {
@@ -266,35 +242,48 @@ nouveau_sgdma_init(struct drm_device *dev)
 	}
 
 	dev_priv->gart_info.sg_dummy_page =
-		alloc_page(GFP_KERNEL|__GFP_DMA32);
+		alloc_page(GFP_KERNEL|__GFP_DMA32|__GFP_ZERO);
+	if (!dev_priv->gart_info.sg_dummy_page) {
+		nouveau_gpuobj_ref(NULL, &gpuobj);
+		return -ENOMEM;
+	}
+
 	set_bit(PG_locked, &dev_priv->gart_info.sg_dummy_page->flags);
 	dev_priv->gart_info.sg_dummy_bus =
-		pci_map_page(dev->pdev, dev_priv->gart_info.sg_dummy_page, 0,
+		pci_map_page(pdev, dev_priv->gart_info.sg_dummy_page, 0,
 			     PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	if (pci_dma_mapping_error(pdev, dev_priv->gart_info.sg_dummy_bus)) {
+		nouveau_gpuobj_ref(NULL, &gpuobj);
+		return -EFAULT;
+	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	if (dev_priv->card_type < NV_50) {
+		/* Special case: this object is allocated from the global
+		 * instmem heap, so its cinst is invalid.  It is used on all
+		 * channels, which need a valid cinst, so set it to pinst.
+		 */
+		gpuobj->cinst = gpuobj->pinst;
+
 		/* Maybe use NV_DMA_TARGET_AGP for PCIE? NVIDIA do this, and
 		 * confirmed to work on c51.  Perhaps means NV_DMA_TARGET_PCIE
 		 * on those cards? */
-		nv_wo32(dev, gpuobj, 0, NV_CLASS_DMA_IN_MEMORY |
-				       (1 << 12) /* PT present */ |
-				       (0 << 13) /* PT *not* linear */ |
-				       (NV_DMA_ACCESS_RW  << 14) |
-				       (NV_DMA_TARGET_PCI << 16));
-		nv_wo32(dev, gpuobj, 1, aper_size - 1);
+		nv_wo32(gpuobj, 0, NV_CLASS_DMA_IN_MEMORY |
+				   (1 << 12) /* PT present */ |
+				   (0 << 13) /* PT *not* linear */ |
+				   (NV_DMA_ACCESS_RW  << 14) |
+				   (NV_DMA_TARGET_PCI << 16));
+		nv_wo32(gpuobj, 4, aper_size - 1);
 		for (i = 2; i < 2 + (aper_size >> 12); i++) {
-			nv_wo32(dev, gpuobj, i,
-				    dev_priv->gart_info.sg_dummy_bus | 3);
+			nv_wo32(gpuobj, i * 4,
+				dev_priv->gart_info.sg_dummy_bus | 3);
 		}
 	} else {
 		for (i = 0; i < obj_size; i += 8) {
-			nv_wo32(dev, gpuobj, (i+0)/4,
-				    dev_priv->gart_info.sg_dummy_bus | 0x21);
-			nv_wo32(dev, gpuobj, (i+4)/4, 0);
+			nv_wo32(gpuobj, i + 0, 0x00000000);
+			nv_wo32(gpuobj, i + 4, 0x00000000);
 		}
 	}
-	dev_priv->engine.instmem.finish_access(dev);
+	dev_priv->engine.instmem.flush(dev);
 
 	dev_priv->gart_info.type      = NOUVEAU_GART_SGDMA;
 	dev_priv->gart_info.aper_base = 0;
@@ -317,7 +306,7 @@ nouveau_sgdma_takedown(struct drm_device *dev)
 		dev_priv->gart_info.sg_dummy_bus = 0;
 	}
 
-	nouveau_gpuobj_del(dev, &dev_priv->gart_info.sg_ctxdma);
+	nouveau_gpuobj_ref(NULL, &dev_priv->gart_info.sg_ctxdma);
 }
 
 int
@@ -325,14 +314,11 @@ nouveau_sgdma_get_page(struct drm_device *dev, uint32_t offset, uint32_t *page)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_gpuobj *gpuobj = dev_priv->gart_info.sg_ctxdma;
-	struct nouveau_instmem_engine *instmem = &dev_priv->engine.instmem;
 	int pte;
 
-	pte = (offset >> NV_CTXDMA_PAGE_SHIFT);
+	pte = (offset >> NV_CTXDMA_PAGE_SHIFT) << 2;
 	if (dev_priv->card_type < NV_50) {
-		instmem->prepare_access(dev, false);
-		*page = nv_ro32(dev, gpuobj, (pte + 2)) & ~NV_CTXDMA_PAGE_MASK;
-		instmem->finish_access(dev);
+		*page = nv_ro32(gpuobj, (pte + 8)) & ~NV_CTXDMA_PAGE_MASK;
 		return 0;
 	}
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_state.c b/drivers/gpu/drm/nouveau/nouveau_state.c
index b02a231..be85960 100644
--- a/drivers/gpu/drm/nouveau/nouveau_state.c
+++ b/drivers/gpu/drm/nouveau/nouveau_state.c
@@ -35,9 +35,11 @@
 #include "nouveau_drv.h"
 #include "nouveau_drm.h"
 #include "nouveau_fbcon.h"
+#include "nouveau_ramht.h"
 #include "nv50_display.h"
 
 static void nouveau_stub_takedown(struct drm_device *dev) {}
+static int nouveau_stub_init(struct drm_device *dev) { return 0; }
 
 static int nouveau_init_engine_ptrs(struct drm_device *dev)
 {
@@ -54,8 +56,7 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv04_instmem_clear;
 		engine->instmem.bind		= nv04_instmem_bind;
 		engine->instmem.unbind		= nv04_instmem_unbind;
-		engine->instmem.prepare_access	= nv04_instmem_prepare_access;
-		engine->instmem.finish_access	= nv04_instmem_finish_access;
+		engine->instmem.flush		= nv04_instmem_flush;
 		engine->mc.init			= nv04_mc_init;
 		engine->mc.takedown		= nv04_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
@@ -78,13 +79,22 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.disable		= nv04_fifo_disable;
 		engine->fifo.enable		= nv04_fifo_enable;
 		engine->fifo.reassign		= nv04_fifo_reassign;
-		engine->fifo.cache_flush	= nv04_fifo_cache_flush;
 		engine->fifo.cache_pull		= nv04_fifo_cache_pull;
 		engine->fifo.channel_id		= nv04_fifo_channel_id;
 		engine->fifo.create_context	= nv04_fifo_create_context;
 		engine->fifo.destroy_context	= nv04_fifo_destroy_context;
 		engine->fifo.load_context	= nv04_fifo_load_context;
 		engine->fifo.unload_context	= nv04_fifo_unload_context;
+		engine->display.early_init	= nv04_display_early_init;
+		engine->display.late_takedown	= nv04_display_late_takedown;
+		engine->display.create		= nv04_display_create;
+		engine->display.init		= nv04_display_init;
+		engine->display.destroy		= nv04_display_destroy;
+		engine->gpio.init		= nouveau_stub_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= NULL;
+		engine->gpio.set		= NULL;
+		engine->gpio.irq_enable		= NULL;
 		break;
 	case 0x10:
 		engine->instmem.init		= nv04_instmem_init;
@@ -95,8 +105,7 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv04_instmem_clear;
 		engine->instmem.bind		= nv04_instmem_bind;
 		engine->instmem.unbind		= nv04_instmem_unbind;
-		engine->instmem.prepare_access	= nv04_instmem_prepare_access;
-		engine->instmem.finish_access	= nv04_instmem_finish_access;
+		engine->instmem.flush		= nv04_instmem_flush;
 		engine->mc.init			= nv04_mc_init;
 		engine->mc.takedown		= nv04_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
@@ -121,13 +130,22 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.disable		= nv04_fifo_disable;
 		engine->fifo.enable		= nv04_fifo_enable;
 		engine->fifo.reassign		= nv04_fifo_reassign;
-		engine->fifo.cache_flush	= nv04_fifo_cache_flush;
 		engine->fifo.cache_pull		= nv04_fifo_cache_pull;
 		engine->fifo.channel_id		= nv10_fifo_channel_id;
 		engine->fifo.create_context	= nv10_fifo_create_context;
 		engine->fifo.destroy_context	= nv10_fifo_destroy_context;
 		engine->fifo.load_context	= nv10_fifo_load_context;
 		engine->fifo.unload_context	= nv10_fifo_unload_context;
+		engine->display.early_init	= nv04_display_early_init;
+		engine->display.late_takedown	= nv04_display_late_takedown;
+		engine->display.create		= nv04_display_create;
+		engine->display.init		= nv04_display_init;
+		engine->display.destroy		= nv04_display_destroy;
+		engine->gpio.init		= nouveau_stub_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv10_gpio_get;
+		engine->gpio.set		= nv10_gpio_set;
+		engine->gpio.irq_enable		= NULL;
 		break;
 	case 0x20:
 		engine->instmem.init		= nv04_instmem_init;
@@ -138,8 +156,7 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv04_instmem_clear;
 		engine->instmem.bind		= nv04_instmem_bind;
 		engine->instmem.unbind		= nv04_instmem_unbind;
-		engine->instmem.prepare_access	= nv04_instmem_prepare_access;
-		engine->instmem.finish_access	= nv04_instmem_finish_access;
+		engine->instmem.flush		= nv04_instmem_flush;
 		engine->mc.init			= nv04_mc_init;
 		engine->mc.takedown		= nv04_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
@@ -164,13 +181,22 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.disable		= nv04_fifo_disable;
 		engine->fifo.enable		= nv04_fifo_enable;
 		engine->fifo.reassign		= nv04_fifo_reassign;
-		engine->fifo.cache_flush	= nv04_fifo_cache_flush;
 		engine->fifo.cache_pull		= nv04_fifo_cache_pull;
 		engine->fifo.channel_id		= nv10_fifo_channel_id;
 		engine->fifo.create_context	= nv10_fifo_create_context;
 		engine->fifo.destroy_context	= nv10_fifo_destroy_context;
 		engine->fifo.load_context	= nv10_fifo_load_context;
 		engine->fifo.unload_context	= nv10_fifo_unload_context;
+		engine->display.early_init	= nv04_display_early_init;
+		engine->display.late_takedown	= nv04_display_late_takedown;
+		engine->display.create		= nv04_display_create;
+		engine->display.init		= nv04_display_init;
+		engine->display.destroy		= nv04_display_destroy;
+		engine->gpio.init		= nouveau_stub_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv10_gpio_get;
+		engine->gpio.set		= nv10_gpio_set;
+		engine->gpio.irq_enable		= NULL;
 		break;
 	case 0x30:
 		engine->instmem.init		= nv04_instmem_init;
@@ -181,15 +207,14 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv04_instmem_clear;
 		engine->instmem.bind		= nv04_instmem_bind;
 		engine->instmem.unbind		= nv04_instmem_unbind;
-		engine->instmem.prepare_access	= nv04_instmem_prepare_access;
-		engine->instmem.finish_access	= nv04_instmem_finish_access;
+		engine->instmem.flush		= nv04_instmem_flush;
 		engine->mc.init			= nv04_mc_init;
 		engine->mc.takedown		= nv04_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
 		engine->timer.read		= nv04_timer_read;
 		engine->timer.takedown		= nv04_timer_takedown;
-		engine->fb.init			= nv10_fb_init;
-		engine->fb.takedown		= nv10_fb_takedown;
+		engine->fb.init			= nv30_fb_init;
+		engine->fb.takedown		= nv30_fb_takedown;
 		engine->fb.set_region_tiling	= nv10_fb_set_region_tiling;
 		engine->graph.grclass		= nv30_graph_grclass;
 		engine->graph.init		= nv30_graph_init;
@@ -207,13 +232,22 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.disable		= nv04_fifo_disable;
 		engine->fifo.enable		= nv04_fifo_enable;
 		engine->fifo.reassign		= nv04_fifo_reassign;
-		engine->fifo.cache_flush	= nv04_fifo_cache_flush;
 		engine->fifo.cache_pull		= nv04_fifo_cache_pull;
 		engine->fifo.channel_id		= nv10_fifo_channel_id;
 		engine->fifo.create_context	= nv10_fifo_create_context;
 		engine->fifo.destroy_context	= nv10_fifo_destroy_context;
 		engine->fifo.load_context	= nv10_fifo_load_context;
 		engine->fifo.unload_context	= nv10_fifo_unload_context;
+		engine->display.early_init	= nv04_display_early_init;
+		engine->display.late_takedown	= nv04_display_late_takedown;
+		engine->display.create		= nv04_display_create;
+		engine->display.init		= nv04_display_init;
+		engine->display.destroy		= nv04_display_destroy;
+		engine->gpio.init		= nouveau_stub_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv10_gpio_get;
+		engine->gpio.set		= nv10_gpio_set;
+		engine->gpio.irq_enable		= NULL;
 		break;
 	case 0x40:
 	case 0x60:
@@ -225,8 +259,7 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv04_instmem_clear;
 		engine->instmem.bind		= nv04_instmem_bind;
 		engine->instmem.unbind		= nv04_instmem_unbind;
-		engine->instmem.prepare_access	= nv04_instmem_prepare_access;
-		engine->instmem.finish_access	= nv04_instmem_finish_access;
+		engine->instmem.flush		= nv04_instmem_flush;
 		engine->mc.init			= nv40_mc_init;
 		engine->mc.takedown		= nv40_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
@@ -251,13 +284,22 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.disable		= nv04_fifo_disable;
 		engine->fifo.enable		= nv04_fifo_enable;
 		engine->fifo.reassign		= nv04_fifo_reassign;
-		engine->fifo.cache_flush	= nv04_fifo_cache_flush;
 		engine->fifo.cache_pull		= nv04_fifo_cache_pull;
 		engine->fifo.channel_id		= nv10_fifo_channel_id;
 		engine->fifo.create_context	= nv40_fifo_create_context;
 		engine->fifo.destroy_context	= nv40_fifo_destroy_context;
 		engine->fifo.load_context	= nv40_fifo_load_context;
 		engine->fifo.unload_context	= nv40_fifo_unload_context;
+		engine->display.early_init	= nv04_display_early_init;
+		engine->display.late_takedown	= nv04_display_late_takedown;
+		engine->display.create		= nv04_display_create;
+		engine->display.init		= nv04_display_init;
+		engine->display.destroy		= nv04_display_destroy;
+		engine->gpio.init		= nouveau_stub_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv10_gpio_get;
+		engine->gpio.set		= nv10_gpio_set;
+		engine->gpio.irq_enable		= NULL;
 		break;
 	case 0x50:
 	case 0x80: /* gotta love NVIDIA's consistency.. */
@@ -271,8 +313,10 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->instmem.clear		= nv50_instmem_clear;
 		engine->instmem.bind		= nv50_instmem_bind;
 		engine->instmem.unbind		= nv50_instmem_unbind;
-		engine->instmem.prepare_access	= nv50_instmem_prepare_access;
-		engine->instmem.finish_access	= nv50_instmem_finish_access;
+		if (dev_priv->chipset == 0x50)
+			engine->instmem.flush	= nv50_instmem_flush;
+		else
+			engine->instmem.flush	= nv84_instmem_flush;
 		engine->mc.init			= nv50_mc_init;
 		engine->mc.takedown		= nv50_mc_takedown;
 		engine->timer.init		= nv04_timer_init;
@@ -300,6 +344,64 @@ static int nouveau_init_engine_ptrs(struct drm_device *dev)
 		engine->fifo.destroy_context	= nv50_fifo_destroy_context;
 		engine->fifo.load_context	= nv50_fifo_load_context;
 		engine->fifo.unload_context	= nv50_fifo_unload_context;
+		engine->display.early_init	= nv50_display_early_init;
+		engine->display.late_takedown	= nv50_display_late_takedown;
+		engine->display.create		= nv50_display_create;
+		engine->display.init		= nv50_display_init;
+		engine->display.destroy		= nv50_display_destroy;
+		engine->gpio.init		= nv50_gpio_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv50_gpio_get;
+		engine->gpio.set		= nv50_gpio_set;
+		engine->gpio.irq_enable		= nv50_gpio_irq_enable;
+		break;
+	case 0xC0:
+		engine->instmem.init		= nvc0_instmem_init;
+		engine->instmem.takedown	= nvc0_instmem_takedown;
+		engine->instmem.suspend		= nvc0_instmem_suspend;
+		engine->instmem.resume		= nvc0_instmem_resume;
+		engine->instmem.populate	= nvc0_instmem_populate;
+		engine->instmem.clear		= nvc0_instmem_clear;
+		engine->instmem.bind		= nvc0_instmem_bind;
+		engine->instmem.unbind		= nvc0_instmem_unbind;
+		engine->instmem.flush		= nvc0_instmem_flush;
+		engine->mc.init			= nv50_mc_init;
+		engine->mc.takedown		= nv50_mc_takedown;
+		engine->timer.init		= nv04_timer_init;
+		engine->timer.read		= nv04_timer_read;
+		engine->timer.takedown		= nv04_timer_takedown;
+		engine->fb.init			= nvc0_fb_init;
+		engine->fb.takedown		= nvc0_fb_takedown;
+		engine->graph.grclass		= NULL; /* nvc0_graph_grclass */
+		engine->graph.init		= nvc0_graph_init;
+		engine->graph.takedown		= nvc0_graph_takedown;
+		engine->graph.fifo_access	= nvc0_graph_fifo_access;
+		engine->graph.channel		= nvc0_graph_channel;
+		engine->graph.create_context	= nvc0_graph_create_context;
+		engine->graph.destroy_context	= nvc0_graph_destroy_context;
+		engine->graph.load_context	= nvc0_graph_load_context;
+		engine->graph.unload_context	= nvc0_graph_unload_context;
+		engine->fifo.channels		= 128;
+		engine->fifo.init		= nvc0_fifo_init;
+		engine->fifo.takedown		= nvc0_fifo_takedown;
+		engine->fifo.disable		= nvc0_fifo_disable;
+		engine->fifo.enable		= nvc0_fifo_enable;
+		engine->fifo.reassign		= nvc0_fifo_reassign;
+		engine->fifo.channel_id		= nvc0_fifo_channel_id;
+		engine->fifo.create_context	= nvc0_fifo_create_context;
+		engine->fifo.destroy_context	= nvc0_fifo_destroy_context;
+		engine->fifo.load_context	= nvc0_fifo_load_context;
+		engine->fifo.unload_context	= nvc0_fifo_unload_context;
+		engine->display.early_init	= nv50_display_early_init;
+		engine->display.late_takedown	= nv50_display_late_takedown;
+		engine->display.create		= nv50_display_create;
+		engine->display.init		= nv50_display_init;
+		engine->display.destroy		= nv50_display_destroy;
+		engine->gpio.init		= nv50_gpio_init;
+		engine->gpio.takedown		= nouveau_stub_takedown;
+		engine->gpio.get		= nv50_gpio_get;
+		engine->gpio.set		= nv50_gpio_set;
+		engine->gpio.irq_enable		= nv50_gpio_irq_enable;
 		break;
 	default:
 		NV_ERROR(dev, "NV%02x unsupported\n", dev_priv->chipset);
@@ -331,16 +433,14 @@ static int
 nouveau_card_init_channel(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *gpuobj;
+	struct nouveau_gpuobj *gpuobj = NULL;
 	int ret;
 
 	ret = nouveau_channel_alloc(dev, &dev_priv->channel,
-				    (struct drm_file *)-2,
-				    NvDmaFB, NvDmaTT);
+				    (struct drm_file *)-2, NvDmaFB, NvDmaTT);
 	if (ret)
 		return ret;
 
-	gpuobj = NULL;
 	ret = nouveau_gpuobj_dma_new(dev_priv->channel, NV_CLASS_DMA_IN_MEMORY,
 				     0, dev_priv->vram_size,
 				     NV_DMA_ACCESS_RW, NV_DMA_TARGET_VIDMEM,
@@ -348,26 +448,25 @@ nouveau_card_init_channel(struct drm_device *dev)
 	if (ret)
 		goto out_err;
 
-	ret = nouveau_gpuobj_ref_add(dev, dev_priv->channel, NvDmaVRAM,
-				     gpuobj, NULL);
+	ret = nouveau_ramht_insert(dev_priv->channel, NvDmaVRAM, gpuobj);
+	nouveau_gpuobj_ref(NULL, &gpuobj);
 	if (ret)
 		goto out_err;
 
-	gpuobj = NULL;
 	ret = nouveau_gpuobj_gart_dma_new(dev_priv->channel, 0,
 					  dev_priv->gart_info.aper_size,
 					  NV_DMA_ACCESS_RW, &gpuobj, NULL);
 	if (ret)
 		goto out_err;
 
-	ret = nouveau_gpuobj_ref_add(dev, dev_priv->channel, NvDmaGART,
-				     gpuobj, NULL);
+	ret = nouveau_ramht_insert(dev_priv->channel, NvDmaGART, gpuobj);
+	nouveau_gpuobj_ref(NULL, &gpuobj);
 	if (ret)
 		goto out_err;
 
 	return 0;
+
 out_err:
-	nouveau_gpuobj_del(dev, &gpuobj);
 	nouveau_channel_free(dev_priv->channel);
 	dev_priv->channel = NULL;
 	return ret;
@@ -407,11 +506,6 @@ nouveau_card_init(struct drm_device *dev)
 	struct nouveau_engine *engine;
 	int ret;
 
-	NV_DEBUG(dev, "prev state = %d\n", dev_priv->init_state);
-
-	if (dev_priv->init_state == NOUVEAU_CARD_INIT_DONE)
-		return 0;
-
 	vga_client_register(dev->pdev, dev, NULL, nouveau_vga_set_decode);
 	vga_switcheroo_register_client(dev->pdev, nouveau_switcheroo_set_state,
 				       nouveau_switcheroo_can_switch);
@@ -421,50 +515,48 @@ nouveau_card_init(struct drm_device *dev)
 	if (ret)
 		goto out;
 	engine = &dev_priv->engine;
-	dev_priv->init_state = NOUVEAU_CARD_INIT_FAILED;
 	spin_lock_init(&dev_priv->context_switch_lock);
 
+	/* Make the CRTCs and I2C buses accessible */
+	ret = engine->display.early_init(dev);
+	if (ret)
+		goto out;
+
 	/* Parse BIOS tables / Run init tables if card not POSTed */
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		ret = nouveau_bios_init(dev);
-		if (ret)
-			goto out;
-	}
+	ret = nouveau_bios_init(dev);
+	if (ret)
+		goto out_display_early;
 
-	ret = nouveau_mem_detect(dev);
+	ret = nouveau_mem_vram_init(dev);
 	if (ret)
 		goto out_bios;
 
-	ret = nouveau_gpuobj_early_init(dev);
+	ret = nouveau_gpuobj_init(dev);
 	if (ret)
-		goto out_bios;
+		goto out_vram;
 
-	/* Initialise instance memory, must happen before mem_init so we
-	 * know exactly how much VRAM we're able to use for "normal"
-	 * purposes.
-	 */
 	ret = engine->instmem.init(dev);
 	if (ret)
-		goto out_gpuobj_early;
+		goto out_gpuobj;
 
-	/* Setup the memory manager */
-	ret = nouveau_mem_init(dev);
+	ret = nouveau_mem_gart_init(dev);
 	if (ret)
 		goto out_instmem;
 
-	ret = nouveau_gpuobj_init(dev);
-	if (ret)
-		goto out_mem;
-
 	/* PMC */
 	ret = engine->mc.init(dev);
 	if (ret)
-		goto out_gpuobj;
+		goto out_gart;
+
+	/* PGPIO */
+	ret = engine->gpio.init(dev);
+	if (ret)
+		goto out_mc;
 
 	/* PTIMER */
 	ret = engine->timer.init(dev);
 	if (ret)
-		goto out_mc;
+		goto out_gpio;
 
 	/* PFB */
 	ret = engine->fb.init(dev);
@@ -485,12 +577,16 @@ nouveau_card_init(struct drm_device *dev)
 			goto out_graph;
 	}
 
+	ret = engine->display.create(dev);
+	if (ret)
+		goto out_fifo;
+
 	/* this call irq_preinstall, register irq handler and
 	 * call irq_postinstall
 	 */
 	ret = drm_irq_install(dev);
 	if (ret)
-		goto out_fifo;
+		goto out_display;
 
 	ret = drm_vblank_init(dev, 0);
 	if (ret)
@@ -504,35 +600,18 @@ nouveau_card_init(struct drm_device *dev)
 			goto out_irq;
 	}
 
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		if (dev_priv->card_type >= NV_50)
-			ret = nv50_display_create(dev);
-		else
-			ret = nv04_display_create(dev);
-		if (ret)
-			goto out_channel;
-	}
-
 	ret = nouveau_backlight_init(dev);
 	if (ret)
 		NV_ERROR(dev, "Error %d registering backlight\n", ret);
 
-	dev_priv->init_state = NOUVEAU_CARD_INIT_DONE;
-
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		nouveau_fbcon_init(dev);
-		drm_kms_helper_poll_init(dev);
-	}
-
+	nouveau_fbcon_init(dev);
+	drm_kms_helper_poll_init(dev);
 	return 0;
 
-out_channel:
-	if (dev_priv->channel) {
-		nouveau_channel_free(dev_priv->channel);
-		dev_priv->channel = NULL;
-	}
 out_irq:
 	drm_irq_uninstall(dev);
+out_display:
+	engine->display.destroy(dev);
 out_fifo:
 	if (!nouveau_noaccel)
 		engine->fifo.takedown(dev);
@@ -543,19 +622,22 @@ out_fb:
 	engine->fb.takedown(dev);
 out_timer:
 	engine->timer.takedown(dev);
+out_gpio:
+	engine->gpio.takedown(dev);
 out_mc:
 	engine->mc.takedown(dev);
-out_gpuobj:
-	nouveau_gpuobj_takedown(dev);
-out_mem:
-	nouveau_sgdma_takedown(dev);
-	nouveau_mem_close(dev);
+out_gart:
+	nouveau_mem_gart_fini(dev);
 out_instmem:
 	engine->instmem.takedown(dev);
-out_gpuobj_early:
-	nouveau_gpuobj_late_takedown(dev);
+out_gpuobj:
+	nouveau_gpuobj_takedown(dev);
+out_vram:
+	nouveau_mem_vram_fini(dev);
 out_bios:
 	nouveau_bios_takedown(dev);
+out_display_early:
+	engine->display.late_takedown(dev);
 out:
 	vga_client_register(dev->pdev, NULL, NULL, NULL);
 	return ret;
@@ -566,45 +648,38 @@ static void nouveau_card_takedown(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_engine *engine = &dev_priv->engine;
 
-	NV_DEBUG(dev, "prev state = %d\n", dev_priv->init_state);
-
-	if (dev_priv->init_state != NOUVEAU_CARD_INIT_DOWN) {
+	nouveau_backlight_exit(dev);
 
-		nouveau_backlight_exit(dev);
-
-		if (dev_priv->channel) {
-			nouveau_channel_free(dev_priv->channel);
-			dev_priv->channel = NULL;
-		}
-
-		if (!nouveau_noaccel) {
-			engine->fifo.takedown(dev);
-			engine->graph.takedown(dev);
-		}
-		engine->fb.takedown(dev);
-		engine->timer.takedown(dev);
-		engine->mc.takedown(dev);
+	if (dev_priv->channel) {
+		nouveau_channel_free(dev_priv->channel);
+		dev_priv->channel = NULL;
+	}
 
-		mutex_lock(&dev->struct_mutex);
-		ttm_bo_clean_mm(&dev_priv->ttm.bdev, TTM_PL_VRAM);
-		ttm_bo_clean_mm(&dev_priv->ttm.bdev, TTM_PL_TT);
-		mutex_unlock(&dev->struct_mutex);
-		nouveau_sgdma_takedown(dev);
+	if (!nouveau_noaccel) {
+		engine->fifo.takedown(dev);
+		engine->graph.takedown(dev);
+	}
+	engine->fb.takedown(dev);
+	engine->timer.takedown(dev);
+	engine->gpio.takedown(dev);
+	engine->mc.takedown(dev);
+	engine->display.late_takedown(dev);
 
-		nouveau_gpuobj_takedown(dev);
-		nouveau_mem_close(dev);
-		engine->instmem.takedown(dev);
+	mutex_lock(&dev->struct_mutex);
+	ttm_bo_clean_mm(&dev_priv->ttm.bdev, TTM_PL_VRAM);
+	ttm_bo_clean_mm(&dev_priv->ttm.bdev, TTM_PL_TT);
+	mutex_unlock(&dev->struct_mutex);
+	nouveau_mem_gart_fini(dev);
 
-		if (drm_core_check_feature(dev, DRIVER_MODESET))
-			drm_irq_uninstall(dev);
+	engine->instmem.takedown(dev);
+	nouveau_gpuobj_takedown(dev);
+	nouveau_mem_vram_fini(dev);
 
-		nouveau_gpuobj_late_takedown(dev);
-		nouveau_bios_takedown(dev);
+	drm_irq_uninstall(dev);
 
-		vga_client_register(dev->pdev, NULL, NULL, NULL);
+	nouveau_bios_takedown(dev);
 
-		dev_priv->init_state = NOUVEAU_CARD_INIT_DOWN;
-	}
+	vga_client_register(dev->pdev, NULL, NULL, NULL);
 }
 
 /* here a client dies, release the stuff that was allocated for its
@@ -691,22 +766,26 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 	struct drm_nouveau_private *dev_priv;
 	uint32_t reg0;
 	resource_size_t mmio_start_offs;
+	int ret;
 
 	dev_priv = kzalloc(sizeof(*dev_priv), GFP_KERNEL);
-	if (!dev_priv)
-		return -ENOMEM;
+	if (!dev_priv) {
+		ret = -ENOMEM;
+		goto err_out;
+	}
 	dev->dev_private = dev_priv;
 	dev_priv->dev = dev;
 
 	dev_priv->flags = flags & NOUVEAU_FLAGS;
-	dev_priv->init_state = NOUVEAU_CARD_INIT_DOWN;
 
 	NV_DEBUG(dev, "vendor: 0x%X device: 0x%X class: 0x%X\n",
 		 dev->pci_vendor, dev->pci_device, dev->pdev->class);
 
 	dev_priv->wq = create_workqueue("nouveau");
-	if (!dev_priv->wq)
-		return -EINVAL;
+	if (!dev_priv->wq) {
+		ret = -EINVAL;
+		goto err_priv;
+	}
 
 	/* resource 0 is mmio regs */
 	/* resource 1 is linear FB */
@@ -719,7 +798,8 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 	if (!dev_priv->mmio) {
 		NV_ERROR(dev, "Unable to initialize the mmio mapping. "
 			 "Please report your setup to " DRIVER_EMAIL "\n");
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_wq;
 	}
 	NV_DEBUG(dev, "regs mapped ok at 0x%llx\n",
 					(unsigned long long)mmio_start_offs);
@@ -765,19 +845,21 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 	case 0xa0:
 		dev_priv->card_type = NV_50;
 		break;
+	case 0xc0:
+		dev_priv->card_type = NV_C0;
+		break;
 	default:
 		NV_INFO(dev, "Unsupported chipset 0x%08x\n", reg0);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_mmio;
 	}
 
 	NV_INFO(dev, "Detected an NV%2x generation card (0x%08x)\n",
 		dev_priv->card_type, reg0);
 
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		int ret = nouveau_remove_conflicting_drivers(dev);
-		if (ret)
-			return ret;
-	}
+	ret = nouveau_remove_conflicting_drivers(dev);
+	if (ret)
+		goto err_mmio;
 
 	/* Map PRAMIN BAR, or on older cards, the aperture withing BAR0 */
 	if (dev_priv->card_type >= NV_40) {
@@ -791,7 +873,8 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 				dev_priv->ramin_size);
 		if (!dev_priv->ramin) {
 			NV_ERROR(dev, "Failed to PRAMIN BAR");
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto err_mmio;
 		}
 	} else {
 		dev_priv->ramin_size = 1 * 1024 * 1024;
@@ -799,7 +882,8 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 					  dev_priv->ramin_size);
 		if (!dev_priv->ramin) {
 			NV_ERROR(dev, "Failed to map BAR0 PRAMIN.\n");
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto err_mmio;
 		}
 	}
 
@@ -812,46 +896,38 @@ int nouveau_load(struct drm_device *dev, unsigned long flags)
 		dev_priv->flags |= NV_NFORCE2;
 
 	/* For kernel modesetting, init card now and bring up fbcon */
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		int ret = nouveau_card_init(dev);
-		if (ret)
-			return ret;
-	}
+	ret = nouveau_card_init(dev);
+	if (ret)
+		goto err_ramin;
 
 	return 0;
-}
-
-static void nouveau_close(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
-	/* In the case of an error dev_priv may not be allocated yet */
-	if (dev_priv)
-		nouveau_card_takedown(dev);
+err_ramin:
+	iounmap(dev_priv->ramin);
+err_mmio:
+	iounmap(dev_priv->mmio);
+err_wq:
+	destroy_workqueue(dev_priv->wq);
+err_priv:
+	kfree(dev_priv);
+	dev->dev_private = NULL;
+err_out:
+	return ret;
 }
 
-/* KMS: we need mmio at load time, not when the first drm client opens. */
 void nouveau_lastclose(struct drm_device *dev)
 {
-	if (drm_core_check_feature(dev, DRIVER_MODESET))
-		return;
-
-	nouveau_close(dev);
 }
 
 int nouveau_unload(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_engine *engine = &dev_priv->engine;
 
-	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		drm_kms_helper_poll_fini(dev);
-		nouveau_fbcon_fini(dev);
-		if (dev_priv->card_type >= NV_50)
-			nv50_display_destroy(dev);
-		else
-			nv04_display_destroy(dev);
-		nouveau_close(dev);
-	}
+	drm_kms_helper_poll_fini(dev);
+	nouveau_fbcon_fini(dev);
+	engine->display.destroy(dev);
+	nouveau_card_takedown(dev);
 
 	iounmap(dev_priv->mmio);
 	iounmap(dev_priv->ramin);
@@ -867,8 +943,6 @@ int nouveau_ioctl_getparam(struct drm_device *dev, void *data,
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct drm_nouveau_getparam *getparam = data;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	switch (getparam->param) {
 	case NOUVEAU_GETPARAM_CHIPSET_ID:
 		getparam->value = dev_priv->chipset;
@@ -937,8 +1011,6 @@ nouveau_ioctl_setparam(struct drm_device *dev, void *data,
 {
 	struct drm_nouveau_setparam *setparam = data;
 
-	NOUVEAU_CHECK_INITIALISED_WITH_RETURN;
-
 	switch (setparam->param) {
 	default:
 		NV_ERROR(dev, "unknown parameter %lld\n", setparam->param);
@@ -967,7 +1039,7 @@ bool nouveau_wait_until(struct drm_device *dev, uint64_t timeout,
 /* Waits for PGRAPH to go completely idle */
 bool nouveau_wait_for_idle(struct drm_device *dev)
 {
-	if (!nv_wait(NV04_PGRAPH_STATUS, 0xffffffff, 0x00000000)) {
+	if (!nv_wait(dev, NV04_PGRAPH_STATUS, 0xffffffff, 0x00000000)) {
 		NV_ERROR(dev, "PGRAPH idle timed out with status 0x%08x\n",
 			 nv_rd32(dev, NV04_PGRAPH_STATUS));
 		return false;
diff --git a/drivers/gpu/drm/nouveau/nv04_crtc.c b/drivers/gpu/drm/nouveau/nv04_crtc.c
index eba687f..291a4cb 100644
--- a/drivers/gpu/drm/nouveau/nv04_crtc.c
+++ b/drivers/gpu/drm/nouveau/nv04_crtc.c
@@ -157,6 +157,7 @@ nv_crtc_dpms(struct drm_crtc *crtc, int mode)
 {
 	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
+	struct drm_connector *connector;
 	unsigned char seq1 = 0, crtc17 = 0;
 	unsigned char crtc1A;
 
@@ -211,6 +212,10 @@ nv_crtc_dpms(struct drm_crtc *crtc, int mode)
 	NVVgaSeqReset(dev, nv_crtc->index, false);
 
 	NVWriteVgaCrtc(dev, nv_crtc->index, NV_CIO_CRE_RPC1_INDEX, crtc1A);
+
+	/* Update connector polling modes */
+	list_for_each_entry(connector, &dev->mode_config.connector_list, head)
+		nouveau_connector_set_polling(connector);
 }
 
 static bool
@@ -537,6 +542,9 @@ nv_crtc_mode_set_regs(struct drm_crtc *crtc, struct drm_display_mode * mode)
 	 * 1 << 30 on 0x60.830), for no apparent reason */
 	regp->CRTC[NV_CIO_CRE_59] = off_chip_digital;
 
+	if (dev_priv->card_type >= NV_30)
+		regp->CRTC[0x9f] = off_chip_digital ? 0x11 : 0x1;
+
 	regp->crtc_830 = mode->crtc_vdisplay - 3;
 	regp->crtc_834 = mode->crtc_vdisplay - 1;
 
@@ -710,6 +718,7 @@ static void nv_crtc_destroy(struct drm_crtc *crtc)
 
 	drm_crtc_cleanup(crtc);
 
+	nouveau_bo_unmap(nv_crtc->cursor.nvbo);
 	nouveau_bo_ref(NULL, &nv_crtc->cursor.nvbo);
 	kfree(nv_crtc);
 }
@@ -820,7 +829,7 @@ nv04_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
 	crtc_wr_cio_state(crtc, regp, NV_CIO_CRE_FF_INDEX);
 	crtc_wr_cio_state(crtc, regp, NV_CIO_CRE_FFLWM__INDEX);
 
-	if (dev_priv->card_type >= NV_30) {
+	if (dev_priv->card_type >= NV_20) {
 		regp->CRTC[NV_CIO_CRE_47] = arb_lwm >> 8;
 		crtc_wr_cio_state(crtc, regp, NV_CIO_CRE_47);
 	}
diff --git a/drivers/gpu/drm/nouveau/nv04_dac.c b/drivers/gpu/drm/nouveau/nv04_dac.c
index 1cb19e3..ba6423f 100644
--- a/drivers/gpu/drm/nouveau/nv04_dac.c
+++ b/drivers/gpu/drm/nouveau/nv04_dac.c
@@ -220,6 +220,7 @@ uint32_t nv17_dac_sample_load(struct drm_encoder *encoder)
 {
 	struct drm_device *dev = encoder->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_gpio_engine *gpio = &dev_priv->engine.gpio;
 	struct dcb_entry *dcb = nouveau_encoder(encoder)->dcb;
 	uint32_t sample, testval, regoffset = nv04_dac_output_offset(encoder);
 	uint32_t saved_powerctrl_2 = 0, saved_powerctrl_4 = 0, saved_routput,
@@ -251,22 +252,21 @@ uint32_t nv17_dac_sample_load(struct drm_encoder *encoder)
 		nvWriteMC(dev, NV_PBUS_POWERCTRL_4, saved_powerctrl_4 & 0xffffffcf);
 	}
 
-	saved_gpio1 = nv17_gpio_get(dev, DCB_GPIO_TVDAC1);
-	saved_gpio0 = nv17_gpio_get(dev, DCB_GPIO_TVDAC0);
+	saved_gpio1 = gpio->get(dev, DCB_GPIO_TVDAC1);
+	saved_gpio0 = gpio->get(dev, DCB_GPIO_TVDAC0);
 
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC1, dcb->type == OUTPUT_TV);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC0, dcb->type == OUTPUT_TV);
+	gpio->set(dev, DCB_GPIO_TVDAC1, dcb->type == OUTPUT_TV);
+	gpio->set(dev, DCB_GPIO_TVDAC0, dcb->type == OUTPUT_TV);
 
 	msleep(4);
 
 	saved_routput = NVReadRAMDAC(dev, 0, NV_PRAMDAC_DACCLK + regoffset);
 	head = (saved_routput & 0x100) >> 8;
-#if 0
-	/* if there's a spare crtc, using it will minimise flicker for the case
-	 * where the in-use crtc is in use by an off-chip tmds encoder */
-	if (xf86_config->crtc[head]->enabled && !xf86_config->crtc[head ^ 1]->enabled)
+
+	/* if there's a spare crtc, using it will minimise flicker */
+	if (!(NVReadVgaCrtc(dev, head, NV_CIO_CRE_RPC1_INDEX) & 0xC0))
 		head ^= 1;
-#endif
+
 	/* nv driver and nv31 use 0xfffffeee, nv34 and 6600 use 0xfffffece */
 	routput = (saved_routput & 0xfffffece) | head << 8;
 
@@ -291,6 +291,8 @@ uint32_t nv17_dac_sample_load(struct drm_encoder *encoder)
 	msleep(5);
 
 	sample = NVReadRAMDAC(dev, 0, NV_PRAMDAC_TEST_CONTROL + regoffset);
+	/* do it again just in case it's a residual current */
+	sample &= NVReadRAMDAC(dev, 0, NV_PRAMDAC_TEST_CONTROL + regoffset);
 
 	temp = NVReadRAMDAC(dev, head, NV_PRAMDAC_TEST_CONTROL);
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_TEST_CONTROL,
@@ -304,8 +306,8 @@ uint32_t nv17_dac_sample_load(struct drm_encoder *encoder)
 		nvWriteMC(dev, NV_PBUS_POWERCTRL_4, saved_powerctrl_4);
 	nvWriteMC(dev, NV_PBUS_POWERCTRL_2, saved_powerctrl_2);
 
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC1, saved_gpio1);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC0, saved_gpio0);
+	gpio->set(dev, DCB_GPIO_TVDAC1, saved_gpio1);
+	gpio->set(dev, DCB_GPIO_TVDAC0, saved_gpio0);
 
 	return sample;
 }
@@ -315,9 +317,12 @@ nv17_dac_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 {
 	struct drm_device *dev = encoder->dev;
 	struct dcb_entry *dcb = nouveau_encoder(encoder)->dcb;
-	uint32_t sample = nv17_dac_sample_load(encoder);
 
-	if (sample & NV_PRAMDAC_TEST_CONTROL_SENSEB_ALLHI) {
+	if (nv04_dac_in_use(encoder))
+		return connector_status_disconnected;
+
+	if (nv17_dac_sample_load(encoder) &
+	    NV_PRAMDAC_TEST_CONTROL_SENSEB_ALLHI) {
 		NV_INFO(dev, "Load detected on output %c\n",
 			'@' + ffs(dcb->or));
 		return connector_status_connected;
@@ -330,6 +335,9 @@ static bool nv04_dac_mode_fixup(struct drm_encoder *encoder,
 				struct drm_display_mode *mode,
 				struct drm_display_mode *adjusted_mode)
 {
+	if (nv04_dac_in_use(encoder))
+		return false;
+
 	return true;
 }
 
@@ -337,22 +345,13 @@ static void nv04_dac_prepare(struct drm_encoder *encoder)
 {
 	struct drm_encoder_helper_funcs *helper = encoder->helper_private;
 	struct drm_device *dev = encoder->dev;
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	int head = nouveau_crtc(encoder->crtc)->index;
-	struct nv04_crtc_reg *crtcstate = dev_priv->mode_reg.crtc_reg;
 
 	helper->dpms(encoder, DRM_MODE_DPMS_OFF);
 
 	nv04_dfp_disable(dev, head);
-
-	/* Some NV4x have unknown values (0x3f, 0x50, 0x54, 0x6b, 0x79, 0x7f)
-	 * at LCD__INDEX which we don't alter
-	 */
-	if (!(crtcstate[head].CRTC[NV_CIO_CRE_LCD__INDEX] & 0x44))
-		crtcstate[head].CRTC[NV_CIO_CRE_LCD__INDEX] = 0;
 }
 
-
 static void nv04_dac_mode_set(struct drm_encoder *encoder,
 			      struct drm_display_mode *mode,
 			      struct drm_display_mode *adjusted_mode)
@@ -428,6 +427,17 @@ void nv04_dac_update_dacclk(struct drm_encoder *encoder, bool enable)
 	}
 }
 
+/* Check if the DAC corresponding to 'encoder' is being used by
+ * someone else. */
+bool nv04_dac_in_use(struct drm_encoder *encoder)
+{
+	struct drm_nouveau_private *dev_priv = encoder->dev->dev_private;
+	struct dcb_entry *dcb = nouveau_encoder(encoder)->dcb;
+
+	return nv_gf4_disp_arch(encoder->dev) &&
+		(dev_priv->dac_users[ffs(dcb->or) - 1] & ~(1 << dcb->index));
+}
+
 static void nv04_dac_dpms(struct drm_encoder *encoder, int mode)
 {
 	struct drm_device *dev = encoder->dev;
@@ -501,11 +511,13 @@ static const struct drm_encoder_funcs nv04_dac_funcs = {
 	.destroy = nv04_dac_destroy,
 };
 
-int nv04_dac_create(struct drm_device *dev, struct dcb_entry *entry)
+int
+nv04_dac_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
 	const struct drm_encoder_helper_funcs *helper;
-	struct drm_encoder *encoder;
 	struct nouveau_encoder *nv_encoder = NULL;
+	struct drm_device *dev = connector->dev;
+	struct drm_encoder *encoder;
 
 	nv_encoder = kzalloc(sizeof(*nv_encoder), GFP_KERNEL);
 	if (!nv_encoder)
@@ -527,5 +539,6 @@ int nv04_dac_create(struct drm_device *dev, struct dcb_entry *entry)
 	encoder->possible_crtcs = entry->heads;
 	encoder->possible_clones = 0;
 
+	drm_mode_connector_attach_encoder(connector, encoder);
 	return 0;
 }
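
Note on nv04_dac_in_use() above: the check reduces to a bitmask test, "is any DCB entry other than mine registered on this DAC?". A minimal standalone sketch of that test follows. It assumes dac_users[] holds one bit per DCB entry index, as the expression in the patch suggests; the sketch_ names and SKETCH_MAX_DACS are hypothetical, and the nv_gf4_disp_arch() precondition is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Hypothetical bound, for the sketch only. */
#define SKETCH_MAX_DACS 4

/* Returns true if some DCB entry other than 'dcb_index' uses the DAC
 * selected by the lowest set bit of 'or_mask'. */
bool sketch_dac_in_use(const uint32_t dac_users[SKETCH_MAX_DACS],
		       int or_mask, int dcb_index)
{
	int dac = ffs(or_mask) - 1;

	if (dac < 0 || dac >= SKETCH_MAX_DACS)
		return false;

	/* Mask out our own bit; anything left belongs to someone else. */
	return (dac_users[dac] & ~(UINT32_C(1) << dcb_index)) != 0;
}
```

With this shape, an encoder never reports its own registration as a conflict, which is why nv17_dac_detect() and nv04_dac_mode_fixup() can call the check unconditionally.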
diff --git a/drivers/gpu/drm/nouveau/nv04_dfp.c b/drivers/gpu/drm/nouveau/nv04_dfp.c
index 41634d4..762d9f2 100644
--- a/drivers/gpu/drm/nouveau/nv04_dfp.c
+++ b/drivers/gpu/drm/nouveau/nv04_dfp.c
@@ -34,6 +34,8 @@
 #include "nouveau_hw.h"
 #include "nvreg.h"
 
+#include "i2c/sil164.h"
+
 #define FP_TG_CONTROL_ON  (NV_PRAMDAC_FP_TG_CONTROL_DISPEN_POS |	\
 			   NV_PRAMDAC_FP_TG_CONTROL_HSYNC_POS |		\
 			   NV_PRAMDAC_FP_TG_CONTROL_VSYNC_POS)
@@ -102,6 +104,8 @@ void nv04_dfp_disable(struct drm_device *dev, int head)
 	}
 	/* don't inadvertently turn it on when state written later */
 	crtcstate[head].fp_control = FP_TG_CONTROL_OFF;
+	crtcstate[head].CRTC[NV_CIO_CRE_LCD__INDEX] &=
+		~NV_CIO_CRE_LCD_ROUTE_MASK;
 }
 
 void nv04_dfp_update_fp_control(struct drm_encoder *encoder, int mode)
@@ -144,6 +148,36 @@ void nv04_dfp_update_fp_control(struct drm_encoder *encoder, int mode)
 	}
 }
 
+static struct drm_encoder *get_tmds_slave(struct drm_encoder *encoder)
+{
+	struct drm_device *dev = encoder->dev;
+	struct dcb_entry *dcb = nouveau_encoder(encoder)->dcb;
+	struct drm_encoder *slave;
+
+	if (dcb->type != OUTPUT_TMDS || dcb->location == DCB_LOC_ON_CHIP)
+		return NULL;
+
+	/* Some BIOSes (e.g. the one in a Quadro FX1000) report several
+	 * TMDS transmitters at the same I2C address, in the same I2C
+	 * bus. This can still work because in that case one of them is
+	 * always hard-wired to a reasonable configuration using straps,
+	 * and the other one needs to be programmed.
+	 *
+	 * I don't think there's a way to know which is which, even the
+	 * blob programs the one exposed via I2C for *both* heads, so
+	 * let's do the same.
+	 */
+	list_for_each_entry(slave, &dev->mode_config.encoder_list, head) {
+		struct dcb_entry *slave_dcb = nouveau_encoder(slave)->dcb;
+
+		if (slave_dcb->type == OUTPUT_TMDS && get_slave_funcs(slave) &&
+		    slave_dcb->tmdsconf.slave_addr == dcb->tmdsconf.slave_addr)
+			return slave;
+	}
+
+	return NULL;
+}
+
 static bool nv04_dfp_mode_fixup(struct drm_encoder *encoder,
 				struct drm_display_mode *mode,
 				struct drm_display_mode *adjusted_mode)
@@ -221,26 +255,21 @@ static void nv04_dfp_prepare(struct drm_encoder *encoder)
 
 	nv04_dfp_prepare_sel_clk(dev, nv_encoder, head);
 
-	/* Some NV4x have unknown values (0x3f, 0x50, 0x54, 0x6b, 0x79, 0x7f)
-	 * at LCD__INDEX which we don't alter
-	 */
-	if (!(*cr_lcd & 0x44)) {
-		*cr_lcd = 0x3;
-
-		if (nv_two_heads(dev)) {
-			if (nv_encoder->dcb->location == DCB_LOC_ON_CHIP)
-				*cr_lcd |= head ? 0x0 : 0x8;
-			else {
-				*cr_lcd |= (nv_encoder->dcb->or << 4) & 0x30;
-				if (nv_encoder->dcb->type == OUTPUT_LVDS)
-					*cr_lcd |= 0x30;
-				if ((*cr_lcd & 0x30) == (*cr_lcd_oth & 0x30)) {
-					/* avoid being connected to both crtcs */
-					*cr_lcd_oth &= ~0x30;
-					NVWriteVgaCrtc(dev, head ^ 1,
-						       NV_CIO_CRE_LCD__INDEX,
-						       *cr_lcd_oth);
-				}
+	*cr_lcd = (*cr_lcd & ~NV_CIO_CRE_LCD_ROUTE_MASK) | 0x3;
+
+	if (nv_two_heads(dev)) {
+		if (nv_encoder->dcb->location == DCB_LOC_ON_CHIP)
+			*cr_lcd |= head ? 0x0 : 0x8;
+		else {
+			*cr_lcd |= (nv_encoder->dcb->or << 4) & 0x30;
+			if (nv_encoder->dcb->type == OUTPUT_LVDS)
+				*cr_lcd |= 0x30;
+			if ((*cr_lcd & 0x30) == (*cr_lcd_oth & 0x30)) {
+				/* avoid being connected to both crtcs */
+				*cr_lcd_oth &= ~0x30;
+				NVWriteVgaCrtc(dev, head ^ 1,
+					       NV_CIO_CRE_LCD__INDEX,
+					       *cr_lcd_oth);
 			}
 		}
 	}
@@ -412,10 +441,7 @@ static void nv04_dfp_commit(struct drm_encoder *encoder)
 	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
 	struct dcb_entry *dcbe = nv_encoder->dcb;
 	int head = nouveau_crtc(encoder->crtc)->index;
-
-	NV_INFO(dev, "Output %s is running on CRTC %d using output %c\n",
-		drm_get_connector_name(&nouveau_encoder_connector_get(nv_encoder)->base),
-		nv_crtc->index, '@' + ffs(nv_encoder->dcb->or));
+	struct drm_encoder *slave_encoder;
 
 	if (dcbe->type == OUTPUT_TMDS)
 		run_tmds_table(dev, dcbe, head, nv_encoder->mode.clock);
@@ -433,6 +459,12 @@ static void nv04_dfp_commit(struct drm_encoder *encoder)
 	else
 		NVWriteRAMDAC(dev, 0, NV_PRAMDAC_TEST_CONTROL + nv04_dac_output_offset(encoder), 0x00100000);
 
+	/* Init external transmitters */
+	slave_encoder = get_tmds_slave(encoder);
+	if (slave_encoder)
+		get_slave_funcs(slave_encoder)->mode_set(
+			slave_encoder, &nv_encoder->mode, &nv_encoder->mode);
+
 	helper->dpms(encoder, DRM_MODE_DPMS_ON);
 
 	NV_INFO(dev, "Output %s is running on CRTC %d using output %c\n",
@@ -440,6 +472,27 @@ static void nv04_dfp_commit(struct drm_encoder *encoder)
 		nv_crtc->index, '@' + ffs(nv_encoder->dcb->or));
 }
 
+static void nv04_dfp_update_backlight(struct drm_encoder *encoder, int mode)
+{
+#ifdef __powerpc__
+	struct drm_device *dev = encoder->dev;
+
+	/* BIOS scripts usually take care of the backlight, thanks
+	 * Apple for your consistency.
+	 */
+	if (dev->pci_device == 0x0179 || dev->pci_device == 0x0189 ||
+	    dev->pci_device == 0x0329) {
+		if (mode == DRM_MODE_DPMS_ON) {
+			nv_mask(dev, NV_PBUS_DEBUG_DUALHEAD_CTL, 0, 1 << 31);
+			nv_mask(dev, NV_PCRTC_GPIO_EXT, 3, 1);
+		} else {
+			nv_mask(dev, NV_PBUS_DEBUG_DUALHEAD_CTL, 1 << 31, 0);
+			nv_mask(dev, NV_PCRTC_GPIO_EXT, 3, 0);
+		}
+	}
+#endif
+}
+
 static inline bool is_powersaving_dpms(int mode)
 {
 	return (mode != DRM_MODE_DPMS_ON);
@@ -487,6 +540,7 @@ static void nv04_lvds_dpms(struct drm_encoder *encoder, int mode)
 					 LVDS_PANEL_OFF, 0);
 	}
 
+	nv04_dfp_update_backlight(encoder, mode);
 	nv04_dfp_update_fp_control(encoder, mode);
 
 	if (mode == DRM_MODE_DPMS_ON)
@@ -510,6 +564,7 @@ static void nv04_tmds_dpms(struct drm_encoder *encoder, int mode)
 	NV_INFO(dev, "Setting dpms mode %d on tmds encoder (output %d)\n",
 		     mode, nv_encoder->dcb->index);
 
+	nv04_dfp_update_backlight(encoder, mode);
 	nv04_dfp_update_fp_control(encoder, mode);
 }
 
@@ -554,10 +609,42 @@ static void nv04_dfp_destroy(struct drm_encoder *encoder)
 
 	NV_DEBUG_KMS(encoder->dev, "\n");
 
+	if (get_slave_funcs(encoder))
+		get_slave_funcs(encoder)->destroy(encoder);
+
 	drm_encoder_cleanup(encoder);
 	kfree(nv_encoder);
 }
 
+static void nv04_tmds_slave_init(struct drm_encoder *encoder)
+{
+	struct drm_device *dev = encoder->dev;
+	struct dcb_entry *dcb = nouveau_encoder(encoder)->dcb;
+	struct nouveau_i2c_chan *i2c = nouveau_i2c_find(dev, 2);
+	struct i2c_board_info info[] = {
+		{
+			.type = "sil164",
+			.addr = (dcb->tmdsconf.slave_addr == 0x7 ? 0x3a : 0x38),
+			.platform_data = &(struct sil164_encoder_params) {
+				SIL164_INPUT_EDGE_RISING
+			}
+		},
+		{ }
+	};
+	int type;
+
+	if (!nv_gf4_disp_arch(dev) || !i2c ||
+	    get_tmds_slave(encoder))
+		return;
+
+	type = nouveau_i2c_identify(dev, "TMDS transmitter", info, 2);
+	if (type < 0)
+		return;
+
+	drm_i2c_encoder_init(dev, to_encoder_slave(encoder),
+			     &i2c->adapter, &info[type]);
+}
+
 static const struct drm_encoder_helper_funcs nv04_lvds_helper_funcs = {
 	.dpms = nv04_lvds_dpms,
 	.save = nv04_dfp_save,
@@ -584,11 +671,12 @@ static const struct drm_encoder_funcs nv04_dfp_funcs = {
 	.destroy = nv04_dfp_destroy,
 };
 
-int nv04_dfp_create(struct drm_device *dev, struct dcb_entry *entry)
+int
+nv04_dfp_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
 	const struct drm_encoder_helper_funcs *helper;
-	struct drm_encoder *encoder;
 	struct nouveau_encoder *nv_encoder = NULL;
+	struct drm_encoder *encoder;
 	int type;
 
 	switch (entry->type) {
@@ -613,11 +701,16 @@ int nv04_dfp_create(struct drm_device *dev, struct dcb_entry *entry)
 	nv_encoder->dcb = entry;
 	nv_encoder->or = ffs(entry->or) - 1;
 
-	drm_encoder_init(dev, encoder, &nv04_dfp_funcs, type);
+	drm_encoder_init(connector->dev, encoder, &nv04_dfp_funcs, type);
 	drm_encoder_helper_add(encoder, helper);
 
 	encoder->possible_crtcs = entry->heads;
 	encoder->possible_clones = 0;
 
+	if (entry->type == OUTPUT_TMDS &&
+	    entry->location != DCB_LOC_ON_CHIP)
+		nv04_tmds_slave_init(encoder);
+
+	drm_mode_connector_attach_encoder(connector, encoder);
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nv04_display.c b/drivers/gpu/drm/nouveau/nv04_display.c
index c7898b4..9e28cf7 100644
--- a/drivers/gpu/drm/nouveau/nv04_display.c
+++ b/drivers/gpu/drm/nouveau/nv04_display.c
@@ -32,8 +32,6 @@
 #include "nouveau_encoder.h"
 #include "nouveau_connector.h"
 
-#define MULTIPLE_ENCODERS(e) (e & (e - 1))
-
 static void
 nv04_display_store_initial_head_owner(struct drm_device *dev)
 {
@@ -41,7 +39,7 @@ nv04_display_store_initial_head_owner(struct drm_device *dev)
 
 	if (dev_priv->chipset != 0x11) {
 		dev_priv->crtc_owner = NVReadVgaCrtc(dev, 0, NV_CIO_CRE_44);
-		goto ownerknown;
+		return;
 	}
 
 	/* reading CR44 is broken on nv11, so we attempt to infer it */
@@ -52,8 +50,6 @@ nv04_display_store_initial_head_owner(struct drm_device *dev)
 		bool tvA = false;
 		bool tvB = false;
 
-		NVLockVgaCrtcs(dev, false);
-
 		slaved_on_B = NVReadVgaCrtc(dev, 1, NV_CIO_CRE_PIXEL_INDEX) &
 									0x80;
 		if (slaved_on_B)
@@ -66,8 +62,6 @@ nv04_display_store_initial_head_owner(struct drm_device *dev)
 			tvA = !(NVReadVgaCrtc(dev, 0, NV_CIO_CRE_LCD__INDEX) &
 					MASK(NV_CIO_CRE_LCD_LCD_SELECT));
 
-		NVLockVgaCrtcs(dev, true);
-
 		if (slaved_on_A && !tvA)
 			dev_priv->crtc_owner = 0x0;
 		else if (slaved_on_B && !tvB)
@@ -79,14 +73,40 @@ nv04_display_store_initial_head_owner(struct drm_device *dev)
 		else
 			dev_priv->crtc_owner = 0x0;
 	}
+}
+
+int
+nv04_display_early_init(struct drm_device *dev)
+{
+	/* Make the I2C buses accessible. */
+	if (!nv_gf4_disp_arch(dev)) {
+		uint32_t pmc_enable = nv_rd32(dev, NV03_PMC_ENABLE);
+
+		if (!(pmc_enable & 1))
+			nv_wr32(dev, NV03_PMC_ENABLE, pmc_enable | 1);
+	}
 
-ownerknown:
-	NV_INFO(dev, "Initial CRTC_OWNER is %d\n", dev_priv->crtc_owner);
+	/* Unlock the VGA CRTCs. */
+	NVLockVgaCrtcs(dev, false);
+
+	/* Make sure the CRTCs aren't in slaved mode. */
+	if (nv_two_heads(dev)) {
+		nv04_display_store_initial_head_owner(dev);
+		NVSetOwner(dev, 0);
+	}
+
+	return 0;
+}
+
+void
+nv04_display_late_takedown(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	if (nv_two_heads(dev))
+		NVSetOwner(dev, dev_priv->crtc_owner);
 
-	/* we need to ensure the heads are not tied henceforth, or reading any
-	 * 8 bit reg on head B will fail
-	 * setting a single arbitrary head solves that */
-	NVSetOwner(dev, 0);
+	NVLockVgaCrtcs(dev, true);
 }
 
 int
@@ -94,14 +114,13 @@ nv04_display_create(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct dcb_table *dcb = &dev_priv->vbios.dcb;
+	struct drm_connector *connector, *ct;
 	struct drm_encoder *encoder;
 	struct drm_crtc *crtc;
 	int i, ret;
 
 	NV_DEBUG_KMS(dev, "\n");
 
-	if (nv_two_heads(dev))
-		nv04_display_store_initial_head_owner(dev);
 	nouveau_hw_save_vga_fonts(dev, 1);
 
 	drm_mode_config_init(dev);
@@ -132,19 +151,23 @@ nv04_display_create(struct drm_device *dev)
 	for (i = 0; i < dcb->entries; i++) {
 		struct dcb_entry *dcbent = &dcb->entry[i];
 
+		connector = nouveau_connector_create(dev, dcbent->connector);
+		if (IS_ERR(connector))
+			continue;
+
 		switch (dcbent->type) {
 		case OUTPUT_ANALOG:
-			ret = nv04_dac_create(dev, dcbent);
+			ret = nv04_dac_create(connector, dcbent);
 			break;
 		case OUTPUT_LVDS:
 		case OUTPUT_TMDS:
-			ret = nv04_dfp_create(dev, dcbent);
+			ret = nv04_dfp_create(connector, dcbent);
 			break;
 		case OUTPUT_TV:
 			if (dcbent->location == DCB_LOC_ON_CHIP)
-				ret = nv17_tv_create(dev, dcbent);
+				ret = nv17_tv_create(connector, dcbent);
 			else
-				ret = nv04_tv_create(dev, dcbent);
+				ret = nv04_tv_create(connector, dcbent);
 			break;
 		default:
 			NV_WARN(dev, "DCB type %d not known\n", dcbent->type);
@@ -155,12 +178,16 @@ nv04_display_create(struct drm_device *dev)
 			continue;
 	}
 
-	for (i = 0; i < dcb->connector.entries; i++)
-		nouveau_connector_create(dev, &dcb->connector.entry[i]);
+	list_for_each_entry_safe(connector, ct,
+				 &dev->mode_config.connector_list, head) {
+		if (!connector->encoder_ids[0]) {
+			NV_WARN(dev, "%s has no encoders, removing\n",
+				drm_get_connector_name(connector));
+			connector->funcs->destroy(connector);
+		}
+	}
 
 	/* Save previous state */
-	NVLockVgaCrtcs(dev, false);
-
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
 		crtc->funcs->save(crtc);
 
@@ -191,8 +218,6 @@ nv04_display_destroy(struct drm_device *dev)
 	}
 
 	/* Restore state */
-	NVLockVgaCrtcs(dev, false);
-
 	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
 		struct drm_encoder_helper_funcs *func = encoder->helper_private;
 
@@ -207,15 +232,12 @@ nv04_display_destroy(struct drm_device *dev)
 	nouveau_hw_save_vga_fonts(dev, 0);
 }
 
-void
-nv04_display_restore(struct drm_device *dev)
+int
+nv04_display_init(struct drm_device *dev)
 {
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct drm_encoder *encoder;
 	struct drm_crtc *crtc;
 
-	NVLockVgaCrtcs(dev, false);
-
 	/* meh.. modeset apparently doesn't setup all the regs and depends
 	 * on pre-existing state, for now load the state of the card *before*
 	 * nouveau was loaded, and then do a modeset.
@@ -233,12 +255,6 @@ nv04_display_restore(struct drm_device *dev)
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
 		crtc->funcs->restore(crtc);
 
-	if (nv_two_heads(dev)) {
-		NV_INFO(dev, "Restoring CRTC_OWNER to %d.\n",
-			dev_priv->crtc_owner);
-		NVSetOwner(dev, dev_priv->crtc_owner);
-	}
-
-	NVLockVgaCrtcs(dev, true);
+	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv04_fbcon.c b/drivers/gpu/drm/nouveau/nv04_fbcon.c
index 1eeac4f..33e4c93 100644
--- a/drivers/gpu/drm/nouveau/nv04_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nv04_fbcon.c
@@ -25,6 +25,7 @@
 #include "drmP.h"
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
+#include "nouveau_ramht.h"
 #include "nouveau_fbcon.h"
 
 void
@@ -169,11 +170,9 @@ nv04_fbcon_grobj_new(struct drm_device *dev, int class, uint32_t handle)
 	if (ret)
 		return ret;
 
-	ret = nouveau_gpuobj_ref_add(dev, dev_priv->channel, handle, obj, NULL);
-	if (ret)
-		return ret;
-
-	return 0;
+	ret = nouveau_ramht_insert(dev_priv->channel, handle, obj);
+	nouveau_gpuobj_ref(NULL, &obj);
+	return ret;
 }
 
 int
diff --git a/drivers/gpu/drm/nouveau/nv04_fifo.c b/drivers/gpu/drm/nouveau/nv04_fifo.c
index 66fe559..708293b 100644
--- a/drivers/gpu/drm/nouveau/nv04_fifo.c
+++ b/drivers/gpu/drm/nouveau/nv04_fifo.c
@@ -27,8 +27,9 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
+#include "nouveau_ramht.h"
 
-#define NV04_RAMFC(c) (dev_priv->ramfc_offset + ((c) * NV04_RAMFC__SIZE))
+#define NV04_RAMFC(c) (dev_priv->ramfc->pinst + ((c) * NV04_RAMFC__SIZE))
 #define NV04_RAMFC__SIZE 32
 #define NV04_RAMFC_DMA_PUT                                       0x00
 #define NV04_RAMFC_DMA_GET                                       0x04
@@ -38,10 +39,8 @@
 #define NV04_RAMFC_ENGINE                                        0x14
 #define NV04_RAMFC_PULL1_ENGINE                                  0x18
 
-#define RAMFC_WR(offset, val) nv_wo32(dev, chan->ramfc->gpuobj, \
-					 NV04_RAMFC_##offset/4, (val))
-#define RAMFC_RD(offset)      nv_ro32(dev, chan->ramfc->gpuobj, \
-					 NV04_RAMFC_##offset/4)
+#define RAMFC_WR(offset, val) nv_wo32(chan->ramfc, NV04_RAMFC_##offset, (val))
+#define RAMFC_RD(offset)      nv_ro32(chan->ramfc, NV04_RAMFC_##offset)
 
 void
 nv04_fifo_disable(struct drm_device *dev)
@@ -72,37 +71,32 @@ nv04_fifo_reassign(struct drm_device *dev, bool enable)
 }
 
 bool
-nv04_fifo_cache_flush(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_timer_engine *ptimer = &dev_priv->engine.timer;
-	uint64_t start = ptimer->read(dev);
-
-	do {
-		if (nv_rd32(dev, NV03_PFIFO_CACHE1_GET) ==
-		    nv_rd32(dev, NV03_PFIFO_CACHE1_PUT))
-			return true;
-
-	} while (ptimer->read(dev) - start < 100000000);
-
-	NV_ERROR(dev, "Timeout flushing the PFIFO cache.\n");
-
-	return false;
-}
-
-bool
 nv04_fifo_cache_pull(struct drm_device *dev, bool enable)
 {
-	uint32_t pull = nv_rd32(dev, NV04_PFIFO_CACHE1_PULL0);
+	int pull = nv_mask(dev, NV04_PFIFO_CACHE1_PULL0, 1, enable);
+
+	if (!enable) {
+		/* In some cases the PFIFO puller may be left in an
+		 * inconsistent state if you try to stop it when it's
+		 * busy translating handles. Sometimes you get a
+		 * PFIFO_CACHE_ERROR, sometimes it just fails silently
+		 * sending incorrect instance offsets to PGRAPH after
+		 * it's started up again. To avoid the latter we
+		 * invalidate the most recently calculated instance.
+		 */
+		if (!nv_wait(dev, NV04_PFIFO_CACHE1_PULL0,
+			     NV04_PFIFO_CACHE1_PULL0_HASH_BUSY, 0))
+			NV_ERROR(dev, "Timeout idling the PFIFO puller.\n");
+
+		if (nv_rd32(dev, NV04_PFIFO_CACHE1_PULL0) &
+		    NV04_PFIFO_CACHE1_PULL0_HASH_FAILED)
+			nv_wr32(dev, NV03_PFIFO_INTR_0,
+				NV_PFIFO_INTR_CACHE_ERROR);
 
-	if (enable) {
-		nv_wr32(dev, NV04_PFIFO_CACHE1_PULL0, pull | 1);
-	} else {
-		nv_wr32(dev, NV04_PFIFO_CACHE1_PULL0, pull & ~1);
 		nv_wr32(dev, NV04_PFIFO_CACHE1_HASH, 0);
 	}
 
-	return !!(pull & 1);
+	return pull & 1;
 }
 
 int
@@ -112,6 +106,12 @@ nv04_fifo_channel_id(struct drm_device *dev)
 			NV03_PFIFO_CACHE1_PUSH1_CHID_MASK;
 }
 
+#ifdef __BIG_ENDIAN
+#define DMA_FETCH_ENDIANNESS NV_PFIFO_CACHE1_BIG_ENDIAN
+#else
+#define DMA_FETCH_ENDIANNESS 0
+#endif
+
 int
 nv04_fifo_create_context(struct nouveau_channel *chan)
 {
@@ -124,25 +124,20 @@ nv04_fifo_create_context(struct nouveau_channel *chan)
 						NV04_RAMFC__SIZE,
 						NVOBJ_FLAG_ZERO_ALLOC |
 						NVOBJ_FLAG_ZERO_FREE,
-						NULL, &chan->ramfc);
+						&chan->ramfc);
 	if (ret)
 		return ret;
 
 	spin_lock_irqsave(&dev_priv->context_switch_lock, flags);
 
 	/* Setup initial state */
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	RAMFC_WR(DMA_PUT, chan->pushbuf_base);
 	RAMFC_WR(DMA_GET, chan->pushbuf_base);
-	RAMFC_WR(DMA_INSTANCE, chan->pushbuf->instance >> 4);
+	RAMFC_WR(DMA_INSTANCE, chan->pushbuf->pinst >> 4);
 	RAMFC_WR(DMA_FETCH, (NV_PFIFO_CACHE1_DMA_FETCH_TRIG_128_BYTES |
 			     NV_PFIFO_CACHE1_DMA_FETCH_SIZE_128_BYTES |
 			     NV_PFIFO_CACHE1_DMA_FETCH_MAX_REQS_8 |
-#ifdef __BIG_ENDIAN
-			     NV_PFIFO_CACHE1_BIG_ENDIAN |
-#endif
-			     0));
-	dev_priv->engine.instmem.finish_access(dev);
+			     DMA_FETCH_ENDIANNESS));
 
 	/* enable the fifo dma operation */
 	nv_wr32(dev, NV04_PFIFO_MODE,
@@ -160,7 +155,7 @@ nv04_fifo_destroy_context(struct nouveau_channel *chan)
 	nv_wr32(dev, NV04_PFIFO_MODE,
 		nv_rd32(dev, NV04_PFIFO_MODE) & ~(1 << chan->id));
 
-	nouveau_gpuobj_ref_del(dev, &chan->ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->ramfc);
 }
 
 static void
@@ -169,8 +164,6 @@ nv04_fifo_do_load_context(struct drm_device *dev, int chid)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	uint32_t fc = NV04_RAMFC(chid), tmp;
 
-	dev_priv->engine.instmem.prepare_access(dev, false);
-
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_PUT, nv_ri32(dev, fc + 0));
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_GET, nv_ri32(dev, fc + 4));
 	tmp = nv_ri32(dev, fc + 8);
@@ -181,8 +174,6 @@ nv04_fifo_do_load_context(struct drm_device *dev, int chid)
 	nv_wr32(dev, NV04_PFIFO_CACHE1_ENGINE, nv_ri32(dev, fc + 20));
 	nv_wr32(dev, NV04_PFIFO_CACHE1_PULL1, nv_ri32(dev, fc + 24));
 
-	dev_priv->engine.instmem.finish_access(dev);
-
 	nv_wr32(dev, NV03_PFIFO_CACHE1_GET, 0);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUT, 0);
 }
@@ -223,7 +214,6 @@ nv04_fifo_unload_context(struct drm_device *dev)
 		return -EINVAL;
 	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	RAMFC_WR(DMA_PUT, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_PUT));
 	RAMFC_WR(DMA_GET, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_GET));
 	tmp  = nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_DCOUNT) << 16;
@@ -233,7 +223,6 @@ nv04_fifo_unload_context(struct drm_device *dev)
 	RAMFC_WR(DMA_FETCH, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_FETCH));
 	RAMFC_WR(ENGINE, nv_rd32(dev, NV04_PFIFO_CACHE1_ENGINE));
 	RAMFC_WR(PULL1_ENGINE, nv_rd32(dev, NV04_PFIFO_CACHE1_PULL1));
-	dev_priv->engine.instmem.finish_access(dev);
 
 	nv04_fifo_do_load_context(dev, pfifo->channels - 1);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUSH1, pfifo->channels - 1);
@@ -269,10 +258,10 @@ nv04_fifo_init_ramxx(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
 	nv_wr32(dev, NV03_PFIFO_RAMHT, (0x03 << 24) /* search 128 */ |
-				       ((dev_priv->ramht_bits - 9) << 16) |
-				       (dev_priv->ramht_offset >> 8));
-	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro_offset>>8);
-	nv_wr32(dev, NV03_PFIFO_RAMFC, dev_priv->ramfc_offset >> 8);
+				       ((dev_priv->ramht->bits - 9) << 16) |
+				       (dev_priv->ramht->gpuobj->pinst >> 8));
+	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro->pinst >> 8);
+	nv_wr32(dev, NV03_PFIFO_RAMFC, dev_priv->ramfc->pinst >> 8);
 }
 
 static void
@@ -297,6 +286,7 @@ nv04_fifo_init(struct drm_device *dev)
 
 	nv04_fifo_init_intr(dev);
 	pfifo->enable(dev);
+	pfifo->reassign(dev, true);
 
 	for (i = 0; i < dev_priv->engine.fifo.channels; i++) {
 		if (dev_priv->fifos[i]) {
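
Note on nv04_fifo_cache_pull() above: the rewrite relies on nv_mask() doing a read-modify-write and handing back the register value from before the write, so "pull & 1" is the previous puller state. A minimal sketch against a simulated register follows. The return-old-value behaviour is an assumption inferred from how the patch uses the result; the sketch_ names are hypothetical, and the HASH_BUSY wait and CACHE1_HASH reset are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated register standing in for NV04_PFIFO_CACHE1_PULL0. */
uint32_t sketch_pull0;

/* Assumed nv_mask() semantics: clear 'mask', OR in 'val',
 * return the value read before the write. */
uint32_t sketch_mask(uint32_t *reg, uint32_t mask, uint32_t val)
{
	uint32_t old = *reg;

	*reg = (old & ~mask) | val;
	return old;
}

/* Set or clear the enable bit; report whether it was set before. */
bool sketch_cache_pull(bool enable)
{
	uint32_t old = sketch_mask(&sketch_pull0, 1, enable ? 1 : 0);

	return old & 1;
}
```

Returning the old state lets a caller disable the puller, do its work, and then restore whatever state was there before.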
diff --git a/drivers/gpu/drm/nouveau/nv04_graph.c b/drivers/gpu/drm/nouveau/nv04_graph.c
index 618355e..c897342 100644
--- a/drivers/gpu/drm/nouveau/nv04_graph.c
+++ b/drivers/gpu/drm/nouveau/nv04_graph.c
@@ -342,7 +342,7 @@ static uint32_t nv04_graph_ctx_regs[] = {
 };
 
 struct graph_state {
-	int nv04[ARRAY_SIZE(nv04_graph_ctx_regs)];
+	uint32_t nv04[ARRAY_SIZE(nv04_graph_ctx_regs)];
 };
 
 struct nouveau_channel *
@@ -527,8 +527,7 @@ static int
 nv04_graph_mthd_set_ref(struct nouveau_channel *chan, int grclass,
 			int mthd, uint32_t data)
 {
-	chan->fence.last_sequence_irq = data;
-	nouveau_fence_handler(chan->dev, chan->id);
+	atomic_set(&chan->fence.last_sequence_irq, data);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv04_instmem.c b/drivers/gpu/drm/nouveau/nv04_instmem.c
index a3b9563..0b5ae29 100644
--- a/drivers/gpu/drm/nouveau/nv04_instmem.c
+++ b/drivers/gpu/drm/nouveau/nv04_instmem.c
@@ -1,6 +1,7 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
+#include "nouveau_ramht.h"
 
 /* returns the size of fifo context */
 static int
@@ -17,104 +18,51 @@ nouveau_fifo_ctx_size(struct drm_device *dev)
 	return 32;
 }
 
-static void
-nv04_instmem_determine_amount(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	int i;
-
-	/* Figure out how much instance memory we need */
-	if (dev_priv->card_type >= NV_40) {
-		/* We'll want more instance memory than this on some NV4x cards.
-		 * There's a 16MB aperture to play with that maps onto the end
-		 * of vram.  For now, only reserve a small piece until we know
-		 * more about what each chipset requires.
-		 */
-		switch (dev_priv->chipset) {
-		case 0x40:
-		case 0x47:
-		case 0x49:
-		case 0x4b:
-			dev_priv->ramin_rsvd_vram = (2 * 1024 * 1024);
-			break;
-		default:
-			dev_priv->ramin_rsvd_vram = (1 * 1024 * 1024);
-			break;
-		}
-	} else {
-		/*XXX: what *are* the limits on <NV40 cards?
-		 */
-		dev_priv->ramin_rsvd_vram = (512 * 1024);
-	}
-	NV_DEBUG(dev, "RAMIN size: %dKiB\n", dev_priv->ramin_rsvd_vram >> 10);
-
-	/* Clear all of it, except the BIOS image that's in the first 64KiB */
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	for (i = 64 * 1024; i < dev_priv->ramin_rsvd_vram; i += 4)
-		nv_wi32(dev, i, 0x00000000);
-	dev_priv->engine.instmem.finish_access(dev);
-}
-
-static void
-nv04_instmem_configure_fixed_tables(struct drm_device *dev)
+int nv04_instmem_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_engine *engine = &dev_priv->engine;
-
-	/* FIFO hash table (RAMHT)
-	 *   use 4k hash table at RAMIN+0x10000
-	 *   TODO: extend the hash table
-	 */
-	dev_priv->ramht_offset = 0x10000;
-	dev_priv->ramht_bits   = 9;
-	dev_priv->ramht_size   = (1 << dev_priv->ramht_bits); /* nr entries */
-	dev_priv->ramht_size  *= 8; /* 2 32-bit values per entry in RAMHT */
-	NV_DEBUG(dev, "RAMHT offset=0x%x, size=%d\n", dev_priv->ramht_offset,
-						      dev_priv->ramht_size);
-
-	/* FIFO runout table (RAMRO) - 512k at 0x11200 */
-	dev_priv->ramro_offset = 0x11200;
-	dev_priv->ramro_size   = 512;
-	NV_DEBUG(dev, "RAMRO offset=0x%x, size=%d\n", dev_priv->ramro_offset,
-						      dev_priv->ramro_size);
-
-	/* FIFO context table (RAMFC)
-	 *   NV40  : Not sure exactly how to position RAMFC on some cards,
-	 *           0x30002 seems to position it at RAMIN+0x20000 on these
-	 *           cards.  RAMFC is 4kb (32 fifos, 128byte entries).
-	 *   Others: Position RAMFC at RAMIN+0x11400
-	 */
-	dev_priv->ramfc_size = engine->fifo.channels *
-						nouveau_fifo_ctx_size(dev);
+	struct nouveau_gpuobj *ramht = NULL;
+	u32 offset, length;
+	int ret;
+
+	/* RAMIN always available */
+	dev_priv->ramin_available = true;
+
+	/* Setup shared RAMHT */
+	ret = nouveau_gpuobj_new_fake(dev, 0x10000, ~0, 4096,
+				      NVOBJ_FLAG_ZERO_ALLOC, &ramht);
+	if (ret)
+		return ret;
+
+	ret = nouveau_ramht_new(dev, ramht, &dev_priv->ramht);
+	nouveau_gpuobj_ref(NULL, &ramht);
+	if (ret)
+		return ret;
+
+	/* And RAMRO */
+	ret = nouveau_gpuobj_new_fake(dev, 0x11200, ~0, 512,
+				      NVOBJ_FLAG_ZERO_ALLOC, &dev_priv->ramro);
+	if (ret)
+		return ret;
+
+	/* And RAMFC */
+	length = dev_priv->engine.fifo.channels * nouveau_fifo_ctx_size(dev);
 	switch (dev_priv->card_type) {
 	case NV_40:
-		dev_priv->ramfc_offset = 0x20000;
+		offset = 0x20000;
 		break;
-	case NV_30:
-	case NV_20:
-	case NV_10:
-	case NV_04:
 	default:
-		dev_priv->ramfc_offset = 0x11400;
+		offset = 0x11400;
 		break;
 	}
-	NV_DEBUG(dev, "RAMFC offset=0x%x, size=%d\n", dev_priv->ramfc_offset,
-						      dev_priv->ramfc_size);
-}
 
-int nv04_instmem_init(struct drm_device *dev)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t offset;
-	int ret = 0;
+	ret = nouveau_gpuobj_new_fake(dev, offset, ~0, length,
+				      NVOBJ_FLAG_ZERO_ALLOC, &dev_priv->ramfc);
+	if (ret)
+		return ret;
 
-	nv04_instmem_determine_amount(dev);
-	nv04_instmem_configure_fixed_tables(dev);
-
-	/* Create a heap to manage RAMIN allocations, we don't allocate
-	 * the space that was reserved for RAMHT/FC/RO.
-	 */
-	offset = dev_priv->ramfc_offset + dev_priv->ramfc_size;
+	/* Only allow space after RAMFC to be used for object allocation */
+	offset += length;
 
 	/* It appears RAMRO (or something?) is controlled by 0x2220/0x2230
 	 * on certain NV4x chipsets as well as RAMFC.  When 0x2230 == 0
@@ -129,69 +77,52 @@ int nv04_instmem_init(struct drm_device *dev)
 			offset = 0x40000;
 	}
 
-	ret = nouveau_mem_init_heap(&dev_priv->ramin_heap,
-				    offset, dev_priv->ramin_rsvd_vram - offset);
+	ret = drm_mm_init(&dev_priv->ramin_heap, offset,
+			  dev_priv->ramin_rsvd_vram - offset);
 	if (ret) {
-		dev_priv->ramin_heap = NULL;
-		NV_ERROR(dev, "Failed to init RAMIN heap\n");
+		NV_ERROR(dev, "Failed to init RAMIN heap: %d\n", ret);
+		return ret;
 	}
 
-	return ret;
+	return 0;
 }
 
 void
 nv04_instmem_takedown(struct drm_device *dev)
 {
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	nouveau_ramht_ref(NULL, &dev_priv->ramht, NULL);
+	nouveau_gpuobj_ref(NULL, &dev_priv->ramro);
+	nouveau_gpuobj_ref(NULL, &dev_priv->ramfc);
 }
 
 int
-nv04_instmem_populate(struct drm_device *dev, struct nouveau_gpuobj *gpuobj, uint32_t *sz)
+nv04_instmem_populate(struct drm_device *dev, struct nouveau_gpuobj *gpuobj,
+		      uint32_t *sz)
 {
-	if (gpuobj->im_backing)
-		return -EINVAL;
-
 	return 0;
 }
 
 void
 nv04_instmem_clear(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 {
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-
-	if (gpuobj && gpuobj->im_backing) {
-		if (gpuobj->im_bound)
-			dev_priv->engine.instmem.unbind(dev, gpuobj);
-		gpuobj->im_backing = NULL;
-	}
 }
 
 int
 nv04_instmem_bind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 {
-	if (!gpuobj->im_pramin || gpuobj->im_bound)
-		return -EINVAL;
-
-	gpuobj->im_bound = 1;
 	return 0;
 }
 
 int
 nv04_instmem_unbind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 {
-	if (gpuobj->im_bound == 0)
-		return -EINVAL;
-
-	gpuobj->im_bound = 0;
 	return 0;
 }
 
 void
-nv04_instmem_prepare_access(struct drm_device *dev, bool write)
-{
-}
-
-void
-nv04_instmem_finish_access(struct drm_device *dev)
+nv04_instmem_flush(struct drm_device *dev)
 {
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv04_mc.c b/drivers/gpu/drm/nouveau/nv04_mc.c
index 617ed1e..2af43a1 100644
--- a/drivers/gpu/drm/nouveau/nv04_mc.c
+++ b/drivers/gpu/drm/nouveau/nv04_mc.c
@@ -11,6 +11,10 @@ nv04_mc_init(struct drm_device *dev)
 	 */
 
 	nv_wr32(dev, NV03_PMC_ENABLE, 0xFFFFFFFF);
+
+	/* Disable PROM access. */
+	nv_wr32(dev, NV_PBUS_PCI_NV_20, NV_PBUS_PCI_NV_20_ROM_SHADOW_ENABLED);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv04_tv.c b/drivers/gpu/drm/nouveau/nv04_tv.c
index c4e3404..9915a3b 100644
--- a/drivers/gpu/drm/nouveau/nv04_tv.c
+++ b/drivers/gpu/drm/nouveau/nv04_tv.c
@@ -34,69 +34,26 @@
 
 #include "i2c/ch7006.h"
 
-static struct {
-	struct i2c_board_info board_info;
-	struct drm_encoder_funcs funcs;
-	struct drm_encoder_helper_funcs hfuncs;
-	void *params;
-
-} nv04_tv_encoder_info[] = {
+static struct i2c_board_info nv04_tv_encoder_info[] = {
 	{
-		.board_info = { I2C_BOARD_INFO("ch7006", 0x75) },
-		.params = &(struct ch7006_encoder_params) {
+		I2C_BOARD_INFO("ch7006", 0x75),
+		.platform_data = &(struct ch7006_encoder_params) {
 			CH7006_FORMAT_RGB24m12I, CH7006_CLOCK_MASTER,
 			0, 0, 0,
 			CH7006_SYNC_SLAVE, CH7006_SYNC_SEPARATED,
 			CH7006_POUT_3_3V, CH7006_ACTIVE_HSYNC
-		},
+		}
 	},
+	{ }
 };
 
-static bool probe_i2c_addr(struct i2c_adapter *adapter, int addr)
-{
-	struct i2c_msg msg = {
-		.addr = addr,
-		.len = 0,
-	};
-
-	return i2c_transfer(adapter, &msg, 1) == 1;
-}
-
 int nv04_tv_identify(struct drm_device *dev, int i2c_index)
 {
-	struct nouveau_i2c_chan *i2c;
-	bool was_locked;
-	int i, ret;
-
-	NV_TRACE(dev, "Probing TV encoders on I2C bus: %d\n", i2c_index);
-
-	i2c = nouveau_i2c_find(dev, i2c_index);
-	if (!i2c)
-		return -ENODEV;
-
-	was_locked = NVLockVgaCrtcs(dev, false);
-
-	for (i = 0; i < ARRAY_SIZE(nv04_tv_encoder_info); i++) {
-		if (probe_i2c_addr(&i2c->adapter,
-				   nv04_tv_encoder_info[i].board_info.addr)) {
-			ret = i;
-			break;
-		}
-	}
-
-	if (i < ARRAY_SIZE(nv04_tv_encoder_info)) {
-		NV_TRACE(dev, "Detected TV encoder: %s\n",
-			 nv04_tv_encoder_info[i].board_info.type);
-
-	} else {
-		NV_TRACE(dev, "No TV encoders found.\n");
-		i = -ENODEV;
-	}
-
-	NVLockVgaCrtcs(dev, was_locked);
-	return i;
+	return nouveau_i2c_identify(dev, "TV encoder",
+				    nv04_tv_encoder_info, i2c_index);
 }
 
+
 #define PLLSEL_TV_CRTC1_MASK				\
 	(NV_PRAMDAC_PLL_COEFF_SELECT_TV_VSCLK1		\
 	 | NV_PRAMDAC_PLL_COEFF_SELECT_TV_PCLK1)
@@ -132,7 +89,7 @@ static void nv04_tv_dpms(struct drm_encoder *encoder, int mode)
 
 	NVWriteRAMDAC(dev, 0, NV_PRAMDAC_PLL_COEFF_SELECT, state->pllsel);
 
-	to_encoder_slave(encoder)->slave_funcs->dpms(encoder, mode);
+	get_slave_funcs(encoder)->dpms(encoder, mode);
 }
 
 static void nv04_tv_bind(struct drm_device *dev, int head, bool bind)
@@ -142,12 +99,10 @@ static void nv04_tv_bind(struct drm_device *dev, int head, bool bind)
 
 	state->tv_setup = 0;
 
-	if (bind) {
-		state->CRTC[NV_CIO_CRE_LCD__INDEX] = 0;
+	if (bind)
 		state->CRTC[NV_CIO_CRE_49] |= 0x10;
-	} else {
+	else
 		state->CRTC[NV_CIO_CRE_49] &= ~0x10;
-	}
 
 	NVWriteVgaCrtc(dev, head, NV_CIO_CRE_LCD__INDEX,
 		       state->CRTC[NV_CIO_CRE_LCD__INDEX]);
@@ -195,7 +150,7 @@ static void nv04_tv_mode_set(struct drm_encoder *encoder,
 	regp->tv_vskew = 1;
 	regp->tv_vsync_delay = 1;
 
-	to_encoder_slave(encoder)->slave_funcs->mode_set(encoder, mode, adjusted_mode);
+	get_slave_funcs(encoder)->mode_set(encoder, mode, adjusted_mode);
 }
 
 static void nv04_tv_commit(struct drm_encoder *encoder)
@@ -214,30 +169,31 @@ static void nv04_tv_commit(struct drm_encoder *encoder)
 
 static void nv04_tv_destroy(struct drm_encoder *encoder)
 {
-	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-
-	to_encoder_slave(encoder)->slave_funcs->destroy(encoder);
-
+	get_slave_funcs(encoder)->destroy(encoder);
 	drm_encoder_cleanup(encoder);
 
-	kfree(nv_encoder);
+	kfree(encoder->helper_private);
+	kfree(nouveau_encoder(encoder));
 }
 
-int nv04_tv_create(struct drm_device *dev, struct dcb_entry *entry)
+static const struct drm_encoder_funcs nv04_tv_funcs = {
+	.destroy = nv04_tv_destroy,
+};
+
+int
+nv04_tv_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
 	struct nouveau_encoder *nv_encoder;
 	struct drm_encoder *encoder;
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct i2c_adapter *adap;
-	struct drm_encoder_funcs *funcs = NULL;
-	struct drm_encoder_helper_funcs *hfuncs = NULL;
-	struct drm_encoder_slave_funcs *sfuncs = NULL;
-	int i2c_index = entry->i2c_index;
+	struct drm_device *dev = connector->dev;
+	struct drm_encoder_helper_funcs *hfuncs;
+	struct drm_encoder_slave_funcs *sfuncs;
+	struct nouveau_i2c_chan *i2c =
+		nouveau_i2c_find(dev, entry->i2c_index);
 	int type, ret;
-	bool was_locked;
 
 	/* Ensure that we can talk to this encoder */
-	type = nv04_tv_identify(dev, i2c_index);
+	type = nv04_tv_identify(dev, entry->i2c_index);
 	if (type < 0)
 		return type;
 
@@ -246,40 +202,31 @@ int nv04_tv_create(struct drm_device *dev, struct dcb_entry *entry)
 	if (!nv_encoder)
 		return -ENOMEM;
 
+	hfuncs = kzalloc(sizeof(*hfuncs), GFP_KERNEL);
+	if (!hfuncs) {
+		ret = -ENOMEM;
+		goto fail_free;
+	}
+
 	/* Initialize the common members */
 	encoder = to_drm_encoder(nv_encoder);
 
-	funcs = &nv04_tv_encoder_info[type].funcs;
-	hfuncs = &nv04_tv_encoder_info[type].hfuncs;
-
-	drm_encoder_init(dev, encoder, funcs, DRM_MODE_ENCODER_TVDAC);
+	drm_encoder_init(dev, encoder, &nv04_tv_funcs, DRM_MODE_ENCODER_TVDAC);
 	drm_encoder_helper_add(encoder, hfuncs);
 
 	encoder->possible_crtcs = entry->heads;
 	encoder->possible_clones = 0;
-
 	nv_encoder->dcb = entry;
 	nv_encoder->or = ffs(entry->or) - 1;
 
 	/* Run the slave-specific initialization */
-	adap = &dev_priv->vbios.dcb.i2c[i2c_index].chan->adapter;
-
-	was_locked = NVLockVgaCrtcs(dev, false);
-
-	ret = drm_i2c_encoder_init(encoder->dev, to_encoder_slave(encoder), adap,
-				   &nv04_tv_encoder_info[type].board_info);
-
-	NVLockVgaCrtcs(dev, was_locked);
-
+	ret = drm_i2c_encoder_init(dev, to_encoder_slave(encoder),
+				   &i2c->adapter, &nv04_tv_encoder_info[type]);
 	if (ret < 0)
-		goto fail;
+		goto fail_cleanup;
 
 	/* Fill the function pointers */
-	sfuncs = to_encoder_slave(encoder)->slave_funcs;
-
-	*funcs = (struct drm_encoder_funcs) {
-		.destroy = nv04_tv_destroy,
-	};
+	sfuncs = get_slave_funcs(encoder);
 
 	*hfuncs = (struct drm_encoder_helper_funcs) {
 		.dpms = nv04_tv_dpms,
@@ -292,14 +239,16 @@ int nv04_tv_create(struct drm_device *dev, struct dcb_entry *entry)
 		.detect = sfuncs->detect,
 	};
 
-	/* Set the slave encoder configuration */
-	sfuncs->set_config(encoder, nv04_tv_encoder_info[type].params);
+	/* Attach it to the specified connector. */
+	sfuncs->create_resources(encoder, connector);
+	drm_mode_connector_attach_encoder(connector, encoder);
 
 	return 0;
 
-fail:
+fail_cleanup:
 	drm_encoder_cleanup(encoder);
-
+	kfree(hfuncs);
+fail_free:
 	kfree(nv_encoder);
 	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/nv10_fifo.c b/drivers/gpu/drm/nouveau/nv10_fifo.c
index 7aeabf2..f1b03ad 100644
--- a/drivers/gpu/drm/nouveau/nv10_fifo.c
+++ b/drivers/gpu/drm/nouveau/nv10_fifo.c
@@ -27,8 +27,9 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
+#include "nouveau_ramht.h"
 
-#define NV10_RAMFC(c) (dev_priv->ramfc_offset + ((c) * NV10_RAMFC__SIZE))
+#define NV10_RAMFC(c) (dev_priv->ramfc->pinst + ((c) * NV10_RAMFC__SIZE))
 #define NV10_RAMFC__SIZE ((dev_priv->chipset) >= 0x17 ? 64 : 32)
 
 int
@@ -48,17 +49,16 @@ nv10_fifo_create_context(struct nouveau_channel *chan)
 
 	ret = nouveau_gpuobj_new_fake(dev, NV10_RAMFC(chan->id), ~0,
 				      NV10_RAMFC__SIZE, NVOBJ_FLAG_ZERO_ALLOC |
-				      NVOBJ_FLAG_ZERO_FREE, NULL, &chan->ramfc);
+				      NVOBJ_FLAG_ZERO_FREE, &chan->ramfc);
 	if (ret)
 		return ret;
 
 	/* Fill entries that are seen filled in dumps of nvidia driver just
 	 * after channel's is put into DMA mode
 	 */
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	nv_wi32(dev, fc +  0, chan->pushbuf_base);
 	nv_wi32(dev, fc +  4, chan->pushbuf_base);
-	nv_wi32(dev, fc + 12, chan->pushbuf->instance >> 4);
+	nv_wi32(dev, fc + 12, chan->pushbuf->pinst >> 4);
 	nv_wi32(dev, fc + 20, NV_PFIFO_CACHE1_DMA_FETCH_TRIG_128_BYTES |
 			      NV_PFIFO_CACHE1_DMA_FETCH_SIZE_128_BYTES |
 			      NV_PFIFO_CACHE1_DMA_FETCH_MAX_REQS_8 |
@@ -66,7 +66,6 @@ nv10_fifo_create_context(struct nouveau_channel *chan)
 			      NV_PFIFO_CACHE1_BIG_ENDIAN |
 #endif
 			      0);
-	dev_priv->engine.instmem.finish_access(dev);
 
 	/* enable the fifo dma operation */
 	nv_wr32(dev, NV04_PFIFO_MODE,
@@ -82,7 +81,7 @@ nv10_fifo_destroy_context(struct nouveau_channel *chan)
 	nv_wr32(dev, NV04_PFIFO_MODE,
 			nv_rd32(dev, NV04_PFIFO_MODE) & ~(1 << chan->id));
 
-	nouveau_gpuobj_ref_del(dev, &chan->ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->ramfc);
 }
 
 static void
@@ -91,8 +90,6 @@ nv10_fifo_do_load_context(struct drm_device *dev, int chid)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	uint32_t fc = NV10_RAMFC(chid), tmp;
 
-	dev_priv->engine.instmem.prepare_access(dev, false);
-
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_PUT, nv_ri32(dev, fc + 0));
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_GET, nv_ri32(dev, fc + 4));
 	nv_wr32(dev, NV10_PFIFO_CACHE1_REF_CNT, nv_ri32(dev, fc + 8));
@@ -117,8 +114,6 @@ nv10_fifo_do_load_context(struct drm_device *dev, int chid)
 	nv_wr32(dev, NV10_PFIFO_CACHE1_DMA_SUBROUTINE, nv_ri32(dev, fc + 48));
 
 out:
-	dev_priv->engine.instmem.finish_access(dev);
-
 	nv_wr32(dev, NV03_PFIFO_CACHE1_GET, 0);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUT, 0);
 }
@@ -155,8 +150,6 @@ nv10_fifo_unload_context(struct drm_device *dev)
 		return 0;
 	fc = NV10_RAMFC(chid);
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
-
 	nv_wi32(dev, fc +  0, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_PUT));
 	nv_wi32(dev, fc +  4, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_GET));
 	nv_wi32(dev, fc +  8, nv_rd32(dev, NV10_PFIFO_CACHE1_REF_CNT));
@@ -179,8 +172,6 @@ nv10_fifo_unload_context(struct drm_device *dev)
 	nv_wi32(dev, fc + 48, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_GET));
 
 out:
-	dev_priv->engine.instmem.finish_access(dev);
-
 	nv10_fifo_do_load_context(dev, pfifo->channels - 1);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUSH1, pfifo->channels - 1);
 	return 0;
@@ -212,14 +203,14 @@ nv10_fifo_init_ramxx(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
 	nv_wr32(dev, NV03_PFIFO_RAMHT, (0x03 << 24) /* search 128 */ |
-				       ((dev_priv->ramht_bits - 9) << 16) |
-				       (dev_priv->ramht_offset >> 8));
-	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro_offset>>8);
+				       ((dev_priv->ramht->bits - 9) << 16) |
+				       (dev_priv->ramht->gpuobj->pinst >> 8));
+	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro->pinst >> 8);
 
 	if (dev_priv->chipset < 0x17) {
-		nv_wr32(dev, NV03_PFIFO_RAMFC, dev_priv->ramfc_offset >> 8);
+		nv_wr32(dev, NV03_PFIFO_RAMFC, dev_priv->ramfc->pinst >> 8);
 	} else {
-		nv_wr32(dev, NV03_PFIFO_RAMFC, (dev_priv->ramfc_offset >> 8) |
+		nv_wr32(dev, NV03_PFIFO_RAMFC, (dev_priv->ramfc->pinst >> 8) |
 					       (1 << 16) /* 64 Bytes entry*/);
 		/* XXX nvidia blob set bit 18, 21,23 for nv20 & nv30 */
 	}
diff --git a/drivers/gpu/drm/nouveau/nv10_gpio.c b/drivers/gpu/drm/nouveau/nv10_gpio.c
new file mode 100644
index 0000000..007fc29
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nv10_gpio.c
@@ -0,0 +1,92 @@
+/*
+ * Copyright (C) 2009 Francisco Jerez.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "drmP.h"
+#include "nouveau_drv.h"
+#include "nouveau_hw.h"
+
+static bool
+get_gpio_location(struct dcb_gpio_entry *ent, uint32_t *reg, uint32_t *shift,
+		  uint32_t *mask)
+{
+	if (ent->line < 2) {
+		*reg = NV_PCRTC_GPIO;
+		*shift = ent->line * 16;
+		*mask = 0x11;
+
+	} else if (ent->line < 10) {
+		*reg = NV_PCRTC_GPIO_EXT;
+		*shift = (ent->line - 2) * 4;
+		*mask = 0x3;
+
+	} else if (ent->line < 14) {
+		*reg = NV_PCRTC_850;
+		*shift = (ent->line - 10) * 4;
+		*mask = 0x3;
+
+	} else {
+		return false;
+	}
+
+	return true;
+}
+
+int
+nv10_gpio_get(struct drm_device *dev, enum dcb_gpio_tag tag)
+{
+	struct dcb_gpio_entry *ent = nouveau_bios_gpio_entry(dev, tag);
+	uint32_t reg, shift, mask, value;
+
+	if (!ent)
+		return -ENODEV;
+
+	if (!get_gpio_location(ent, &reg, &shift, &mask))
+		return -ENODEV;
+
+	value = NVReadCRTC(dev, 0, reg) >> shift;
+
+	return (ent->invert ? 1 : 0) ^ (value & 1);
+}
+
+int
+nv10_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state)
+{
+	struct dcb_gpio_entry *ent = nouveau_bios_gpio_entry(dev, tag);
+	uint32_t reg, shift, mask, value;
+
+	if (!ent)
+		return -ENODEV;
+
+	if (!get_gpio_location(ent, &reg, &shift, &mask))
+		return -ENODEV;
+
+	value = ((ent->invert ? 1 : 0) ^ (state ? 1 : 0)) << shift;
+	mask = ~(mask << shift);
+
+	NVWriteCRTC(dev, 0, reg, value | (NVReadCRTC(dev, 0, reg) & mask));
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nv10_graph.c b/drivers/gpu/drm/nouveau/nv10_graph.c
index fcf2cdd..8e68c97 100644
--- a/drivers/gpu/drm/nouveau/nv10_graph.c
+++ b/drivers/gpu/drm/nouveau/nv10_graph.c
@@ -43,51 +43,51 @@ struct pipe_state {
 };
 
 static int nv10_graph_ctx_regs[] = {
-	NV10_PGRAPH_CTX_SWITCH1,
-	NV10_PGRAPH_CTX_SWITCH2,
-	NV10_PGRAPH_CTX_SWITCH3,
-	NV10_PGRAPH_CTX_SWITCH4,
-	NV10_PGRAPH_CTX_SWITCH5,
-	NV10_PGRAPH_CTX_CACHE1,	/* 8 values from 0x400160 to 0x40017c */
-	NV10_PGRAPH_CTX_CACHE2,	/* 8 values from 0x400180 to 0x40019c */
-	NV10_PGRAPH_CTX_CACHE3,	/* 8 values from 0x4001a0 to 0x4001bc */
-	NV10_PGRAPH_CTX_CACHE4,	/* 8 values from 0x4001c0 to 0x4001dc */
-	NV10_PGRAPH_CTX_CACHE5,	/* 8 values from 0x4001e0 to 0x4001fc */
-	0x00400164,
-	0x00400184,
-	0x004001a4,
-	0x004001c4,
-	0x004001e4,
-	0x00400168,
-	0x00400188,
-	0x004001a8,
-	0x004001c8,
-	0x004001e8,
-	0x0040016c,
-	0x0040018c,
-	0x004001ac,
-	0x004001cc,
-	0x004001ec,
-	0x00400170,
-	0x00400190,
-	0x004001b0,
-	0x004001d0,
-	0x004001f0,
-	0x00400174,
-	0x00400194,
-	0x004001b4,
-	0x004001d4,
-	0x004001f4,
-	0x00400178,
-	0x00400198,
-	0x004001b8,
-	0x004001d8,
-	0x004001f8,
-	0x0040017c,
-	0x0040019c,
-	0x004001bc,
-	0x004001dc,
-	0x004001fc,
+	NV10_PGRAPH_CTX_SWITCH(0),
+	NV10_PGRAPH_CTX_SWITCH(1),
+	NV10_PGRAPH_CTX_SWITCH(2),
+	NV10_PGRAPH_CTX_SWITCH(3),
+	NV10_PGRAPH_CTX_SWITCH(4),
+	NV10_PGRAPH_CTX_CACHE(0, 0),
+	NV10_PGRAPH_CTX_CACHE(0, 1),
+	NV10_PGRAPH_CTX_CACHE(0, 2),
+	NV10_PGRAPH_CTX_CACHE(0, 3),
+	NV10_PGRAPH_CTX_CACHE(0, 4),
+	NV10_PGRAPH_CTX_CACHE(1, 0),
+	NV10_PGRAPH_CTX_CACHE(1, 1),
+	NV10_PGRAPH_CTX_CACHE(1, 2),
+	NV10_PGRAPH_CTX_CACHE(1, 3),
+	NV10_PGRAPH_CTX_CACHE(1, 4),
+	NV10_PGRAPH_CTX_CACHE(2, 0),
+	NV10_PGRAPH_CTX_CACHE(2, 1),
+	NV10_PGRAPH_CTX_CACHE(2, 2),
+	NV10_PGRAPH_CTX_CACHE(2, 3),
+	NV10_PGRAPH_CTX_CACHE(2, 4),
+	NV10_PGRAPH_CTX_CACHE(3, 0),
+	NV10_PGRAPH_CTX_CACHE(3, 1),
+	NV10_PGRAPH_CTX_CACHE(3, 2),
+	NV10_PGRAPH_CTX_CACHE(3, 3),
+	NV10_PGRAPH_CTX_CACHE(3, 4),
+	NV10_PGRAPH_CTX_CACHE(4, 0),
+	NV10_PGRAPH_CTX_CACHE(4, 1),
+	NV10_PGRAPH_CTX_CACHE(4, 2),
+	NV10_PGRAPH_CTX_CACHE(4, 3),
+	NV10_PGRAPH_CTX_CACHE(4, 4),
+	NV10_PGRAPH_CTX_CACHE(5, 0),
+	NV10_PGRAPH_CTX_CACHE(5, 1),
+	NV10_PGRAPH_CTX_CACHE(5, 2),
+	NV10_PGRAPH_CTX_CACHE(5, 3),
+	NV10_PGRAPH_CTX_CACHE(5, 4),
+	NV10_PGRAPH_CTX_CACHE(6, 0),
+	NV10_PGRAPH_CTX_CACHE(6, 1),
+	NV10_PGRAPH_CTX_CACHE(6, 2),
+	NV10_PGRAPH_CTX_CACHE(6, 3),
+	NV10_PGRAPH_CTX_CACHE(6, 4),
+	NV10_PGRAPH_CTX_CACHE(7, 0),
+	NV10_PGRAPH_CTX_CACHE(7, 1),
+	NV10_PGRAPH_CTX_CACHE(7, 2),
+	NV10_PGRAPH_CTX_CACHE(7, 3),
+	NV10_PGRAPH_CTX_CACHE(7, 4),
 	NV10_PGRAPH_CTX_USER,
 	NV04_PGRAPH_DMA_START_0,
 	NV04_PGRAPH_DMA_START_1,
@@ -653,6 +653,78 @@ static int nv17_graph_ctx_regs_find_offset(struct drm_device *dev, int reg)
 	return -1;
 }
 
+static void nv10_graph_load_dma_vtxbuf(struct nouveau_channel *chan,
+				       uint32_t inst)
+{
+	struct drm_device *dev = chan->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
+	uint32_t st2, st2_dl, st2_dh, fifo_ptr, fifo[0x60/4];
+	uint32_t ctx_user, ctx_switch[5];
+	int i, subchan = -1;
+
+	/* NV10TCL_DMA_VTXBUF (method 0x18c) modifies hidden state
+	 * that cannot be restored via MMIO. Do it through the FIFO
+	 * instead.
+	 */
+
+	/* Look for a celsius object */
+	for (i = 0; i < 8; i++) {
+		int class = nv_rd32(dev, NV10_PGRAPH_CTX_CACHE(i, 0)) & 0xfff;
+
+		if (class == 0x56 || class == 0x96 || class == 0x99) {
+			subchan = i;
+			break;
+		}
+	}
+
+	if (subchan < 0 || !inst)
+		return;
+
+	/* Save the current ctx object */
+	ctx_user = nv_rd32(dev, NV10_PGRAPH_CTX_USER);
+	for (i = 0; i < 5; i++)
+		ctx_switch[i] = nv_rd32(dev, NV10_PGRAPH_CTX_SWITCH(i));
+
+	/* Save the FIFO state */
+	st2 = nv_rd32(dev, NV10_PGRAPH_FFINTFC_ST2);
+	st2_dl = nv_rd32(dev, NV10_PGRAPH_FFINTFC_ST2_DL);
+	st2_dh = nv_rd32(dev, NV10_PGRAPH_FFINTFC_ST2_DH);
+	fifo_ptr = nv_rd32(dev, NV10_PGRAPH_FFINTFC_FIFO_PTR);
+
+	for (i = 0; i < ARRAY_SIZE(fifo); i++)
+		fifo[i] = nv_rd32(dev, 0x4007a0 + 4 * i);
+
+	/* Switch to the celsius subchannel */
+	for (i = 0; i < 5; i++)
+		nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(i),
+			nv_rd32(dev, NV10_PGRAPH_CTX_CACHE(subchan, i)));
+	nv_mask(dev, NV10_PGRAPH_CTX_USER, 0xe000, subchan << 13);
+
+	/* Inject NV10TCL_DMA_VTXBUF */
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_FIFO_PTR, 0);
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_ST2,
+		0x2c000000 | chan->id << 20 | subchan << 16 | 0x18c);
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_ST2_DL, inst);
+	nv_mask(dev, NV10_PGRAPH_CTX_CONTROL, 0, 0x10000);
+	pgraph->fifo_access(dev, true);
+	pgraph->fifo_access(dev, false);
+
+	/* Restore the FIFO state */
+	for (i = 0; i < ARRAY_SIZE(fifo); i++)
+		nv_wr32(dev, 0x4007a0 + 4 * i, fifo[i]);
+
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_FIFO_PTR, fifo_ptr);
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_ST2, st2);
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_ST2_DL, st2_dl);
+	nv_wr32(dev, NV10_PGRAPH_FFINTFC_ST2_DH, st2_dh);
+
+	/* Restore the current ctx object */
+	for (i = 0; i < 5; i++)
+		nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(i), ctx_switch[i]);
+	nv_wr32(dev, NV10_PGRAPH_CTX_USER, ctx_user);
+}
+
 int nv10_graph_load_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
@@ -670,6 +742,8 @@ int nv10_graph_load_context(struct nouveau_channel *chan)
 	}
 
 	nv10_graph_load_pipe(chan);
+	nv10_graph_load_dma_vtxbuf(chan, (nv_rd32(dev, NV10_PGRAPH_GLOBALSTATE1)
+					  & 0xffff));
 
 	nv_wr32(dev, NV10_PGRAPH_CTX_CONTROL, 0x10010100);
 	tmp = nv_rd32(dev, NV10_PGRAPH_CTX_USER);
@@ -729,7 +803,7 @@ nv10_graph_context_switch(struct drm_device *dev)
 	/* Load context for next channel */
 	chid = (nv_rd32(dev, NV04_PGRAPH_TRAPPED_ADDR) >> 20) & 0x1f;
 	chan = dev_priv->fifos[chid];
-	if (chan)
+	if (chan && chan->pgraph_ctx)
 		nv10_graph_load_context(chan);
 
 	pgraph->fifo_access(dev, true);
@@ -856,11 +930,12 @@ int nv10_graph_init(struct drm_device *dev)
 	for (i = 0; i < NV10_PFB_TILE__SIZE; i++)
 		nv10_graph_set_region_tiling(dev, i, 0, 0, 0);
 
-	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH1, 0x00000000);
-	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH2, 0x00000000);
-	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH3, 0x00000000);
-	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH4, 0x00000000);
-	nv_wr32(dev, NV10_PGRAPH_STATE      , 0xFFFFFFFF);
+	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(0), 0x00000000);
+	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(1), 0x00000000);
+	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(2), 0x00000000);
+	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(3), 0x00000000);
+	nv_wr32(dev, NV10_PGRAPH_CTX_SWITCH(4), 0x00000000);
+	nv_wr32(dev, NV10_PGRAPH_STATE, 0xFFFFFFFF);
 
 	tmp  = nv_rd32(dev, NV10_PGRAPH_CTX_USER) & 0x00ffffff;
 	tmp |= (dev_priv->engine.fifo.channels - 1) << 24;
diff --git a/drivers/gpu/drm/nouveau/nv17_gpio.c b/drivers/gpu/drm/nouveau/nv17_gpio.c
deleted file mode 100644
index 2e58c33..0000000
--- a/drivers/gpu/drm/nouveau/nv17_gpio.c
+++ /dev/null
@@ -1,92 +0,0 @@
-/*
- * Copyright (C) 2009 Francisco Jerez.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining
- * a copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sublicense, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial
- * portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-
-#include "drmP.h"
-#include "nouveau_drv.h"
-#include "nouveau_hw.h"
-
-static bool
-get_gpio_location(struct dcb_gpio_entry *ent, uint32_t *reg, uint32_t *shift,
-		  uint32_t *mask)
-{
-	if (ent->line < 2) {
-		*reg = NV_PCRTC_GPIO;
-		*shift = ent->line * 16;
-		*mask = 0x11;
-
-	} else if (ent->line < 10) {
-		*reg = NV_PCRTC_GPIO_EXT;
-		*shift = (ent->line - 2) * 4;
-		*mask = 0x3;
-
-	} else if (ent->line < 14) {
-		*reg = NV_PCRTC_850;
-		*shift = (ent->line - 10) * 4;
-		*mask = 0x3;
-
-	} else {
-		return false;
-	}
-
-	return true;
-}
-
-int
-nv17_gpio_get(struct drm_device *dev, enum dcb_gpio_tag tag)
-{
-	struct dcb_gpio_entry *ent = nouveau_bios_gpio_entry(dev, tag);
-	uint32_t reg, shift, mask, value;
-
-	if (!ent)
-		return -ENODEV;
-
-	if (!get_gpio_location(ent, &reg, &shift, &mask))
-		return -ENODEV;
-
-	value = NVReadCRTC(dev, 0, reg) >> shift;
-
-	return (ent->invert ? 1 : 0) ^ (value & 1);
-}
-
-int
-nv17_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state)
-{
-	struct dcb_gpio_entry *ent = nouveau_bios_gpio_entry(dev, tag);
-	uint32_t reg, shift, mask, value;
-
-	if (!ent)
-		return -ENODEV;
-
-	if (!get_gpio_location(ent, &reg, &shift, &mask))
-		return -ENODEV;
-
-	value = ((ent->invert ? 1 : 0) ^ (state ? 1 : 0)) << shift;
-	mask = ~(mask << shift);
-
-	NVWriteCRTC(dev, 0, reg, value | (NVReadCRTC(dev, 0, reg) & mask));
-
-	return 0;
-}
diff --git a/drivers/gpu/drm/nouveau/nv17_tv.c b/drivers/gpu/drm/nouveau/nv17_tv.c
index 74c8803..28119fd 100644
--- a/drivers/gpu/drm/nouveau/nv17_tv.c
+++ b/drivers/gpu/drm/nouveau/nv17_tv.c
@@ -37,6 +37,7 @@ static uint32_t nv42_tv_sample_load(struct drm_encoder *encoder)
 {
 	struct drm_device *dev = encoder->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_gpio_engine *gpio = &dev_priv->engine.gpio;
 	uint32_t testval, regoffset = nv04_dac_output_offset(encoder);
 	uint32_t gpio0, gpio1, fp_htotal, fp_hsync_start, fp_hsync_end,
 		fp_control, test_ctrl, dacclk, ctv_14, ctv_1c, ctv_6c;
@@ -52,8 +53,8 @@ static uint32_t nv42_tv_sample_load(struct drm_encoder *encoder)
 	head = (dacclk & 0x100) >> 8;
 
 	/* Save the previous state. */
-	gpio1 = nv17_gpio_get(dev, DCB_GPIO_TVDAC1);
-	gpio0 = nv17_gpio_get(dev, DCB_GPIO_TVDAC0);
+	gpio1 = gpio->get(dev, DCB_GPIO_TVDAC1);
+	gpio0 = gpio->get(dev, DCB_GPIO_TVDAC0);
 	fp_htotal = NVReadRAMDAC(dev, head, NV_PRAMDAC_FP_HTOTAL);
 	fp_hsync_start = NVReadRAMDAC(dev, head, NV_PRAMDAC_FP_HSYNC_START);
 	fp_hsync_end = NVReadRAMDAC(dev, head, NV_PRAMDAC_FP_HSYNC_END);
@@ -64,8 +65,8 @@ static uint32_t nv42_tv_sample_load(struct drm_encoder *encoder)
 	ctv_6c = NVReadRAMDAC(dev, head, 0x680c6c);
 
 	/* Prepare the DAC for load detection.  */
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC1, true);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC0, true);
+	gpio->set(dev, DCB_GPIO_TVDAC1, true);
+	gpio->set(dev, DCB_GPIO_TVDAC0, true);
 
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_FP_HTOTAL, 1343);
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_FP_HSYNC_START, 1047);
@@ -110,12 +111,31 @@ static uint32_t nv42_tv_sample_load(struct drm_encoder *encoder)
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_FP_HSYNC_END, fp_hsync_end);
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_FP_HSYNC_START, fp_hsync_start);
 	NVWriteRAMDAC(dev, head, NV_PRAMDAC_FP_HTOTAL, fp_htotal);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC1, gpio1);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC0, gpio0);
+	gpio->set(dev, DCB_GPIO_TVDAC1, gpio1);
+	gpio->set(dev, DCB_GPIO_TVDAC0, gpio0);
 
 	return sample;
 }
 
+static bool
+get_tv_detect_quirks(struct drm_device *dev, uint32_t *pin_mask)
+{
+	/* Zotac FX5200 */
+	if (nv_match_device(dev, 0x0322, 0x19da, 0x1035) ||
+	    nv_match_device(dev, 0x0322, 0x19da, 0x2035)) {
+		*pin_mask = 0xc;
+		return false;
+	}
+
+	/* MSI nForce2 IGP */
+	if (nv_match_device(dev, 0x01f0, 0x1462, 0x5710)) {
+		*pin_mask = 0xc;
+		return false;
+	}
+
+	return true;
+}
+
 static enum drm_connector_status
 nv17_tv_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 {
@@ -124,12 +144,20 @@ nv17_tv_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 	struct drm_mode_config *conf = &dev->mode_config;
 	struct nv17_tv_encoder *tv_enc = to_tv_enc(encoder);
 	struct dcb_entry *dcb = tv_enc->base.dcb;
+	bool reliable = get_tv_detect_quirks(dev, &tv_enc->pin_mask);
 
-	if (dev_priv->chipset == 0x42 ||
-	    dev_priv->chipset == 0x43)
-		tv_enc->pin_mask = nv42_tv_sample_load(encoder) >> 28 & 0xe;
-	else
-		tv_enc->pin_mask = nv17_dac_sample_load(encoder) >> 28 & 0xe;
+	if (nv04_dac_in_use(encoder))
+		return connector_status_disconnected;
+
+	if (reliable) {
+		if (dev_priv->chipset == 0x42 ||
+		    dev_priv->chipset == 0x43)
+			tv_enc->pin_mask =
+				nv42_tv_sample_load(encoder) >> 28 & 0xe;
+		else
+			tv_enc->pin_mask =
+				nv17_dac_sample_load(encoder) >> 28 & 0xe;
+	}
 
 	switch (tv_enc->pin_mask) {
 	case 0x2:
@@ -154,7 +182,9 @@ nv17_tv_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 					 conf->tv_subconnector_property,
 					 tv_enc->subconnector);
 
-	if (tv_enc->subconnector) {
+	if (!reliable) {
+		return connector_status_unknown;
+	} else if (tv_enc->subconnector) {
 		NV_INFO(dev, "Load detected on output %c\n",
 			'@' + ffs(dcb->or));
 		return connector_status_connected;
@@ -163,55 +193,56 @@ nv17_tv_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 	}
 }
 
-static const struct {
-	int hdisplay;
-	int vdisplay;
-} modes[] = {
-	{ 640, 400 },
-	{ 640, 480 },
-	{ 720, 480 },
-	{ 720, 576 },
-	{ 800, 600 },
-	{ 1024, 768 },
-	{ 1280, 720 },
-	{ 1280, 1024 },
-	{ 1920, 1080 }
-};
-
-static int nv17_tv_get_modes(struct drm_encoder *encoder,
-			     struct drm_connector *connector)
+static int nv17_tv_get_ld_modes(struct drm_encoder *encoder,
+				struct drm_connector *connector)
 {
 	struct nv17_tv_norm_params *tv_norm = get_tv_norm(encoder);
-	struct drm_display_mode *mode;
-	struct drm_display_mode *output_mode;
+	struct drm_display_mode *mode, *tv_mode;
 	int n = 0;
-	int i;
 
-	if (tv_norm->kind != CTV_ENC_MODE) {
-		struct drm_display_mode *tv_mode;
+	for (tv_mode = nv17_tv_modes; tv_mode->hdisplay; tv_mode++) {
+		mode = drm_mode_duplicate(encoder->dev, tv_mode);
 
-		for (tv_mode = nv17_tv_modes; tv_mode->hdisplay; tv_mode++) {
-			mode = drm_mode_duplicate(encoder->dev, tv_mode);
+		mode->clock = tv_norm->tv_enc_mode.vrefresh *
+			mode->htotal / 1000 *
+			mode->vtotal / 1000;
 
-			mode->clock = tv_norm->tv_enc_mode.vrefresh *
-						mode->htotal / 1000 *
-						mode->vtotal / 1000;
-
-			if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
-				mode->clock *= 2;
+		if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
+			mode->clock *= 2;
 
-			if (mode->hdisplay == tv_norm->tv_enc_mode.hdisplay &&
-			    mode->vdisplay == tv_norm->tv_enc_mode.vdisplay)
-				mode->type |= DRM_MODE_TYPE_PREFERRED;
+		if (mode->hdisplay == tv_norm->tv_enc_mode.hdisplay &&
+		    mode->vdisplay == tv_norm->tv_enc_mode.vdisplay)
+			mode->type |= DRM_MODE_TYPE_PREFERRED;
 
-			drm_mode_probed_add(connector, mode);
-			n++;
-		}
-		return n;
+		drm_mode_probed_add(connector, mode);
+		n++;
 	}
 
-	/* tv_norm->kind == CTV_ENC_MODE */
-	output_mode = &tv_norm->ctv_enc_mode.mode;
+	return n;
+}
+
+static int nv17_tv_get_hd_modes(struct drm_encoder *encoder,
+				struct drm_connector *connector)
+{
+	struct nv17_tv_norm_params *tv_norm = get_tv_norm(encoder);
+	struct drm_display_mode *output_mode = &tv_norm->ctv_enc_mode.mode;
+	struct drm_display_mode *mode;
+	const struct {
+		int hdisplay;
+		int vdisplay;
+	} modes[] = {
+		{ 640, 400 },
+		{ 640, 480 },
+		{ 720, 480 },
+		{ 720, 576 },
+		{ 800, 600 },
+		{ 1024, 768 },
+		{ 1280, 720 },
+		{ 1280, 1024 },
+		{ 1920, 1080 }
+	};
+	int i, n = 0;
+
 	for (i = 0; i < ARRAY_SIZE(modes); i++) {
 		if (modes[i].hdisplay > output_mode->hdisplay ||
 		    modes[i].vdisplay > output_mode->vdisplay)
@@ -221,11 +252,12 @@ static int nv17_tv_get_modes(struct drm_encoder *encoder,
 		    modes[i].vdisplay == output_mode->vdisplay) {
 			mode = drm_mode_duplicate(encoder->dev, output_mode);
 			mode->type |= DRM_MODE_TYPE_PREFERRED;
+
 		} else {
 			mode = drm_cvt_mode(encoder->dev, modes[i].hdisplay,
-				modes[i].vdisplay, 60, false,
-				output_mode->flags & DRM_MODE_FLAG_INTERLACE,
-				false);
+					    modes[i].vdisplay, 60, false,
+					    (output_mode->flags &
+					     DRM_MODE_FLAG_INTERLACE), false);
 		}
 
 		/* CVT modes are sometimes unsuitable... */
@@ -236,6 +268,7 @@ static int nv17_tv_get_modes(struct drm_encoder *encoder,
 					     - mode->hdisplay) * 9 / 10) & ~7;
 			mode->hsync_end = mode->hsync_start + 8;
 		}
+
 		if (output_mode->vdisplay >= 1024) {
 			mode->vtotal = output_mode->vtotal;
 			mode->vsync_start = output_mode->vsync_start;
@@ -246,9 +279,21 @@ static int nv17_tv_get_modes(struct drm_encoder *encoder,
 		drm_mode_probed_add(connector, mode);
 		n++;
 	}
+
 	return n;
 }
 
+static int nv17_tv_get_modes(struct drm_encoder *encoder,
+			     struct drm_connector *connector)
+{
+	struct nv17_tv_norm_params *tv_norm = get_tv_norm(encoder);
+
+	if (tv_norm->kind == CTV_ENC_MODE)
+		return nv17_tv_get_hd_modes(encoder, connector);
+	else
+		return nv17_tv_get_ld_modes(encoder, connector);
+}
+
 static int nv17_tv_mode_valid(struct drm_encoder *encoder,
 			      struct drm_display_mode *mode)
 {
@@ -296,6 +341,9 @@ static bool nv17_tv_mode_fixup(struct drm_encoder *encoder,
 {
 	struct nv17_tv_norm_params *tv_norm = get_tv_norm(encoder);
 
+	if (nv04_dac_in_use(encoder))
+		return false;
+
 	if (tv_norm->kind == CTV_ENC_MODE)
 		adjusted_mode->clock = tv_norm->ctv_enc_mode.mode.clock;
 	else
@@ -307,6 +355,8 @@ static bool nv17_tv_mode_fixup(struct drm_encoder *encoder,
 static void  nv17_tv_dpms(struct drm_encoder *encoder, int mode)
 {
 	struct drm_device *dev = encoder->dev;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_gpio_engine *gpio = &dev_priv->engine.gpio;
 	struct nv17_tv_state *regs = &to_tv_enc(encoder)->state;
 	struct nv17_tv_norm_params *tv_norm = get_tv_norm(encoder);
 
@@ -331,8 +381,8 @@ static void  nv17_tv_dpms(struct drm_encoder *encoder, int mode)
 
 	nv_load_ptv(dev, regs, 200);
 
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC1, mode == DRM_MODE_DPMS_ON);
-	nv17_gpio_set(dev, DCB_GPIO_TVDAC0, mode == DRM_MODE_DPMS_ON);
+	gpio->set(dev, DCB_GPIO_TVDAC1, mode == DRM_MODE_DPMS_ON);
+	gpio->set(dev, DCB_GPIO_TVDAC0, mode == DRM_MODE_DPMS_ON);
 
 	nv04_dac_update_dacclk(encoder, mode == DRM_MODE_DPMS_ON);
 }
@@ -373,15 +423,8 @@ static void nv17_tv_prepare(struct drm_encoder *encoder)
 
 	}
 
-	/* Some NV4x have unknown values (0x3f, 0x50, 0x54, 0x6b, 0x79, 0x7f)
-	 * at LCD__INDEX which we don't alter
-	 */
-	if (!(*cr_lcd & 0x44)) {
-		if (tv_norm->kind == CTV_ENC_MODE)
-			*cr_lcd = 0x1 | (head ? 0x0 : 0x8);
-		else
-			*cr_lcd = 0;
-	}
+	if (tv_norm->kind == CTV_ENC_MODE)
+		*cr_lcd |= 0x1 | (head ? 0x0 : 0x8);
 
 	/* Set the DACCLK register */
 	dacclk = (NVReadRAMDAC(dev, 0, dacclk_off) & ~0x30) | 0x1;
@@ -744,8 +787,10 @@ static struct drm_encoder_funcs nv17_tv_funcs = {
 	.destroy = nv17_tv_destroy,
 };
 
-int nv17_tv_create(struct drm_device *dev, struct dcb_entry *entry)
+int
+nv17_tv_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
+	struct drm_device *dev = connector->dev;
 	struct drm_encoder *encoder;
 	struct nv17_tv_encoder *tv_enc = NULL;
 
@@ -774,5 +819,7 @@ int nv17_tv_create(struct drm_device *dev, struct dcb_entry *entry)
 	encoder->possible_crtcs = entry->heads;
 	encoder->possible_clones = 0;
 
+	nv17_tv_create_resources(encoder, connector);
+	drm_mode_connector_attach_encoder(connector, encoder);
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nv17_tv.h b/drivers/gpu/drm/nouveau/nv17_tv.h
index c00977c..6bf0384 100644
--- a/drivers/gpu/drm/nouveau/nv17_tv.h
+++ b/drivers/gpu/drm/nouveau/nv17_tv.h
@@ -127,7 +127,8 @@ void nv17_ctv_update_rescaler(struct drm_encoder *encoder);
 
 /* TV hardware access functions */
 
-static inline void nv_write_ptv(struct drm_device *dev, uint32_t reg, uint32_t val)
+static inline void nv_write_ptv(struct drm_device *dev, uint32_t reg,
+				uint32_t val)
 {
 	nv_wr32(dev, reg, val);
 }
@@ -137,7 +138,8 @@ static inline uint32_t nv_read_ptv(struct drm_device *dev, uint32_t reg)
 	return nv_rd32(dev, reg);
 }
 
-static inline void nv_write_tv_enc(struct drm_device *dev, uint8_t reg, uint8_t val)
+static inline void nv_write_tv_enc(struct drm_device *dev, uint8_t reg,
+				   uint8_t val)
 {
 	nv_write_ptv(dev, NV_PTV_TV_INDEX, reg);
 	nv_write_ptv(dev, NV_PTV_TV_DATA, val);
@@ -149,8 +151,11 @@ static inline uint8_t nv_read_tv_enc(struct drm_device *dev, uint8_t reg)
 	return nv_read_ptv(dev, NV_PTV_TV_DATA);
 }
 
-#define nv_load_ptv(dev, state, reg) nv_write_ptv(dev, NV_PTV_OFFSET + 0x##reg, state->ptv_##reg)
-#define nv_save_ptv(dev, state, reg) state->ptv_##reg = nv_read_ptv(dev, NV_PTV_OFFSET + 0x##reg)
-#define nv_load_tv_enc(dev, state, reg) nv_write_tv_enc(dev, 0x##reg, state->tv_enc[0x##reg])
+#define nv_load_ptv(dev, state, reg) \
+	nv_write_ptv(dev, NV_PTV_OFFSET + 0x##reg, state->ptv_##reg)
+#define nv_save_ptv(dev, state, reg) \
+	state->ptv_##reg = nv_read_ptv(dev, NV_PTV_OFFSET + 0x##reg)
+#define nv_load_tv_enc(dev, state, reg) \
+	nv_write_tv_enc(dev, 0x##reg, state->tv_enc[0x##reg])
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nv17_tv_modes.c b/drivers/gpu/drm/nouveau/nv17_tv_modes.c
index d64683d..9d3893c 100644
--- a/drivers/gpu/drm/nouveau/nv17_tv_modes.c
+++ b/drivers/gpu/drm/nouveau/nv17_tv_modes.c
@@ -336,12 +336,17 @@ static void tv_setup_filter(struct drm_encoder *encoder)
 			struct filter_params *p = &fparams[k][j];
 
 			for (i = 0; i < 7; i++) {
-				int64_t c = (p->k1 + p->ki*i + p->ki2*i*i + p->ki3*i*i*i)
-					+ (p->kr + p->kir*i + p->ki2r*i*i + p->ki3r*i*i*i)*rs[k]
-					+ (p->kf + p->kif*i + p->ki2f*i*i + p->ki3f*i*i*i)*flicker
-					+ (p->krf + p->kirf*i + p->ki2rf*i*i + p->ki3rf*i*i*i)*flicker*rs[k];
-
-				(*filters[k])[j][i] = (c + id5/2) >> 39 & (0x1 << 31 | 0x7f << 9);
+				int64_t c = (p->k1 + p->ki*i + p->ki2*i*i +
+					     p->ki3*i*i*i)
+					+ (p->kr + p->kir*i + p->ki2r*i*i +
+					   p->ki3r*i*i*i) * rs[k]
+					+ (p->kf + p->kif*i + p->ki2f*i*i +
+					   p->ki3f*i*i*i) * flicker
+					+ (p->krf + p->kirf*i + p->ki2rf*i*i +
+					   p->ki3rf*i*i*i) * flicker * rs[k];
+
+				(*filters[k])[j][i] = (c + id5/2) >> 39
+					& (0x1 << 31 | 0x7f << 9);
 			}
 		}
 	}
@@ -349,7 +354,8 @@ static void tv_setup_filter(struct drm_encoder *encoder)
 
 /* Hardware state saving/restoring */
 
-static void tv_save_filter(struct drm_device *dev, uint32_t base, uint32_t regs[4][7])
+static void tv_save_filter(struct drm_device *dev, uint32_t base,
+			   uint32_t regs[4][7])
 {
 	int i, j;
 	uint32_t offsets[] = { base, base + 0x1c, base + 0x40, base + 0x5c };
@@ -360,7 +366,8 @@ static void tv_save_filter(struct drm_device *dev, uint32_t base, uint32_t regs[
 	}
 }
 
-static void tv_load_filter(struct drm_device *dev, uint32_t base, uint32_t regs[4][7])
+static void tv_load_filter(struct drm_device *dev, uint32_t base,
+			   uint32_t regs[4][7])
 {
 	int i, j;
 	uint32_t offsets[] = { base, base + 0x1c, base + 0x40, base + 0x5c };
@@ -504,10 +511,10 @@ void nv17_tv_update_properties(struct drm_encoder *encoder)
 		break;
 	}
 
-	regs->tv_enc[0x20] = interpolate(0, tv_norm->tv_enc_mode.tv_enc[0x20], 255,
-					 tv_enc->saturation);
-	regs->tv_enc[0x22] = interpolate(0, tv_norm->tv_enc_mode.tv_enc[0x22], 255,
-					 tv_enc->saturation);
+	regs->tv_enc[0x20] = interpolate(0, tv_norm->tv_enc_mode.tv_enc[0x20],
+					 255, tv_enc->saturation);
+	regs->tv_enc[0x22] = interpolate(0, tv_norm->tv_enc_mode.tv_enc[0x22],
+					 255, tv_enc->saturation);
 	regs->tv_enc[0x25] = tv_enc->hue * 255 / 100;
 
 	nv_load_ptv(dev, regs, 204);
@@ -541,7 +548,8 @@ void nv17_ctv_update_rescaler(struct drm_encoder *encoder)
 	int head = nouveau_crtc(encoder->crtc)->index;
 	struct nv04_crtc_reg *regs = &dev_priv->mode_reg.crtc_reg[head];
 	struct drm_display_mode *crtc_mode = &encoder->crtc->mode;
-	struct drm_display_mode *output_mode = &get_tv_norm(encoder)->ctv_enc_mode.mode;
+	struct drm_display_mode *output_mode =
+		&get_tv_norm(encoder)->ctv_enc_mode.mode;
 	int overscan, hmargin, vmargin, hratio, vratio;
 
 	/* The rescaler doesn't do the right thing for interlaced modes. */
@@ -553,13 +561,15 @@ void nv17_ctv_update_rescaler(struct drm_encoder *encoder)
 	hmargin = (output_mode->hdisplay - crtc_mode->hdisplay) / 2;
 	vmargin = (output_mode->vdisplay - crtc_mode->vdisplay) / 2;
 
-	hmargin = interpolate(0, min(hmargin, output_mode->hdisplay/20), hmargin,
-			      overscan);
-	vmargin = interpolate(0, min(vmargin, output_mode->vdisplay/20), vmargin,
-			      overscan);
+	hmargin = interpolate(0, min(hmargin, output_mode->hdisplay/20),
+			      hmargin, overscan);
+	vmargin = interpolate(0, min(vmargin, output_mode->vdisplay/20),
+			      vmargin, overscan);
 
-	hratio = crtc_mode->hdisplay * 0x800 / (output_mode->hdisplay - 2*hmargin);
-	vratio = crtc_mode->vdisplay * 0x800 / (output_mode->vdisplay - 2*vmargin) & ~3;
+	hratio = crtc_mode->hdisplay * 0x800 /
+		(output_mode->hdisplay - 2*hmargin);
+	vratio = crtc_mode->vdisplay * 0x800 /
+		(output_mode->vdisplay - 2*vmargin) & ~3;
 
 	regs->fp_horiz_regs[FP_VALID_START] = hmargin;
 	regs->fp_horiz_regs[FP_VALID_END] = output_mode->hdisplay - hmargin - 1;
diff --git a/drivers/gpu/drm/nouveau/nv20_graph.c b/drivers/gpu/drm/nouveau/nv20_graph.c
index d6fc0a8..93f0d8a 100644
--- a/drivers/gpu/drm/nouveau/nv20_graph.c
+++ b/drivers/gpu/drm/nouveau/nv20_graph.c
@@ -37,49 +37,49 @@ nv20_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x033c/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x03a0/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x03a4/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x047c/4, 0x00000101);
-	nv_wo32(dev, ctx, 0x0490/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x04a8/4, 0x44400000);
+	nv_wo32(ctx, 0x033c, 0xffff0000);
+	nv_wo32(ctx, 0x03a0, 0x0fff0000);
+	nv_wo32(ctx, 0x03a4, 0x0fff0000);
+	nv_wo32(ctx, 0x047c, 0x00000101);
+	nv_wo32(ctx, 0x0490, 0x00000111);
+	nv_wo32(ctx, 0x04a8, 0x44400000);
 	for (i = 0x04d4; i <= 0x04e0; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x04f4; i <= 0x0500; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080000);
+		nv_wo32(ctx, i, 0x00080000);
 	for (i = 0x050c; i <= 0x0518; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x051c; i <= 0x0528; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x000105b8);
+		nv_wo32(ctx, i, 0x000105b8);
 	for (i = 0x052c; i <= 0x0538; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
+		nv_wo32(ctx, i, 0x00080008);
 	for (i = 0x055c; i <= 0x0598; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x05a4/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x05fc/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0604/4, 0x00004000);
-	nv_wo32(dev, ctx, 0x0610/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0618/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x061c/4, 0x00010000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x05a4, 0x4b7fffff);
+	nv_wo32(ctx, 0x05fc, 0x00000001);
+	nv_wo32(ctx, 0x0604, 0x00004000);
+	nv_wo32(ctx, 0x0610, 0x00000001);
+	nv_wo32(ctx, 0x0618, 0x00040000);
+	nv_wo32(ctx, 0x061c, 0x00010000);
 	for (i = 0x1c1c; i <= 0x248c; i += 16) {
-		nv_wo32(dev, ctx, (i + 0)/4, 0x10700ff9);
-		nv_wo32(dev, ctx, (i + 4)/4, 0x0436086c);
-		nv_wo32(dev, ctx, (i + 8)/4, 0x000c001b);
+		nv_wo32(ctx, (i + 0), 0x10700ff9);
+		nv_wo32(ctx, (i + 4), 0x0436086c);
+		nv_wo32(ctx, (i + 8), 0x000c001b);
 	}
-	nv_wo32(dev, ctx, 0x281c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2830/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x285c/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2860/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2864/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x286c/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2870/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2878/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x2880/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x34a4/4, 0x000fe000);
-	nv_wo32(dev, ctx, 0x3530/4, 0x000003f8);
-	nv_wo32(dev, ctx, 0x3540/4, 0x002fe000);
+	nv_wo32(ctx, 0x281c, 0x3f800000);
+	nv_wo32(ctx, 0x2830, 0x3f800000);
+	nv_wo32(ctx, 0x285c, 0x40000000);
+	nv_wo32(ctx, 0x2860, 0x3f800000);
+	nv_wo32(ctx, 0x2864, 0x3f000000);
+	nv_wo32(ctx, 0x286c, 0x40000000);
+	nv_wo32(ctx, 0x2870, 0x3f800000);
+	nv_wo32(ctx, 0x2878, 0xbf800000);
+	nv_wo32(ctx, 0x2880, 0xbf800000);
+	nv_wo32(ctx, 0x34a4, 0x000fe000);
+	nv_wo32(ctx, 0x3530, 0x000003f8);
+	nv_wo32(ctx, 0x3540, 0x002fe000);
 	for (i = 0x355c; i <= 0x3578; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x001c527c);
+		nv_wo32(ctx, i, 0x001c527c);
 }
 
 static void
@@ -87,58 +87,58 @@ nv25_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x035c/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x03c0/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x03c4/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x049c/4, 0x00000101);
-	nv_wo32(dev, ctx, 0x04b0/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x04c8/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x04cc/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x04d0/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x04e4/4, 0x44400000);
-	nv_wo32(dev, ctx, 0x04fc/4, 0x4b800000);
+	nv_wo32(ctx, 0x035c, 0xffff0000);
+	nv_wo32(ctx, 0x03c0, 0x0fff0000);
+	nv_wo32(ctx, 0x03c4, 0x0fff0000);
+	nv_wo32(ctx, 0x049c, 0x00000101);
+	nv_wo32(ctx, 0x04b0, 0x00000111);
+	nv_wo32(ctx, 0x04c8, 0x00000080);
+	nv_wo32(ctx, 0x04cc, 0xffff0000);
+	nv_wo32(ctx, 0x04d0, 0x00000001);
+	nv_wo32(ctx, 0x04e4, 0x44400000);
+	nv_wo32(ctx, 0x04fc, 0x4b800000);
 	for (i = 0x0510; i <= 0x051c; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x0530; i <= 0x053c; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080000);
+		nv_wo32(ctx, i, 0x00080000);
 	for (i = 0x0548; i <= 0x0554; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x0558; i <= 0x0564; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x000105b8);
+		nv_wo32(ctx, i, 0x000105b8);
 	for (i = 0x0568; i <= 0x0574; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
+		nv_wo32(ctx, i, 0x00080008);
 	for (i = 0x0598; i <= 0x05d4; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x05e0/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x0620/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0624/4, 0x30201000);
-	nv_wo32(dev, ctx, 0x0628/4, 0x70605040);
-	nv_wo32(dev, ctx, 0x062c/4, 0xb0a09080);
-	nv_wo32(dev, ctx, 0x0630/4, 0xf0e0d0c0);
-	nv_wo32(dev, ctx, 0x0664/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x066c/4, 0x00004000);
-	nv_wo32(dev, ctx, 0x0678/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0680/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x0684/4, 0x00010000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x05e0, 0x4b7fffff);
+	nv_wo32(ctx, 0x0620, 0x00000080);
+	nv_wo32(ctx, 0x0624, 0x30201000);
+	nv_wo32(ctx, 0x0628, 0x70605040);
+	nv_wo32(ctx, 0x062c, 0xb0a09080);
+	nv_wo32(ctx, 0x0630, 0xf0e0d0c0);
+	nv_wo32(ctx, 0x0664, 0x00000001);
+	nv_wo32(ctx, 0x066c, 0x00004000);
+	nv_wo32(ctx, 0x0678, 0x00000001);
+	nv_wo32(ctx, 0x0680, 0x00040000);
+	nv_wo32(ctx, 0x0684, 0x00010000);
 	for (i = 0x1b04; i <= 0x2374; i += 16) {
-		nv_wo32(dev, ctx, (i + 0)/4, 0x10700ff9);
-		nv_wo32(dev, ctx, (i + 4)/4, 0x0436086c);
-		nv_wo32(dev, ctx, (i + 8)/4, 0x000c001b);
+		nv_wo32(ctx, (i + 0), 0x10700ff9);
+		nv_wo32(ctx, (i + 4), 0x0436086c);
+		nv_wo32(ctx, (i + 8), 0x000c001b);
 	}
-	nv_wo32(dev, ctx, 0x2704/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2718/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2744/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2748/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x274c/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x2754/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2758/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2760/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x2768/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x308c/4, 0x000fe000);
-	nv_wo32(dev, ctx, 0x3108/4, 0x000003f8);
-	nv_wo32(dev, ctx, 0x3468/4, 0x002fe000);
+	nv_wo32(ctx, 0x2704, 0x3f800000);
+	nv_wo32(ctx, 0x2718, 0x3f800000);
+	nv_wo32(ctx, 0x2744, 0x40000000);
+	nv_wo32(ctx, 0x2748, 0x3f800000);
+	nv_wo32(ctx, 0x274c, 0x3f000000);
+	nv_wo32(ctx, 0x2754, 0x40000000);
+	nv_wo32(ctx, 0x2758, 0x3f800000);
+	nv_wo32(ctx, 0x2760, 0xbf800000);
+	nv_wo32(ctx, 0x2768, 0xbf800000);
+	nv_wo32(ctx, 0x308c, 0x000fe000);
+	nv_wo32(ctx, 0x3108, 0x000003f8);
+	nv_wo32(ctx, 0x3468, 0x002fe000);
 	for (i = 0x3484; i <= 0x34a0; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x001c527c);
+		nv_wo32(ctx, i, 0x001c527c);
 }
 
 static void
@@ -146,49 +146,49 @@ nv2a_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x033c/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x03a0/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x03a4/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x047c/4, 0x00000101);
-	nv_wo32(dev, ctx, 0x0490/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x04a8/4, 0x44400000);
+	nv_wo32(ctx, 0x033c, 0xffff0000);
+	nv_wo32(ctx, 0x03a0, 0x0fff0000);
+	nv_wo32(ctx, 0x03a4, 0x0fff0000);
+	nv_wo32(ctx, 0x047c, 0x00000101);
+	nv_wo32(ctx, 0x0490, 0x00000111);
+	nv_wo32(ctx, 0x04a8, 0x44400000);
 	for (i = 0x04d4; i <= 0x04e0; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x04f4; i <= 0x0500; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080000);
+		nv_wo32(ctx, i, 0x00080000);
 	for (i = 0x050c; i <= 0x0518; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x051c; i <= 0x0528; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x000105b8);
+		nv_wo32(ctx, i, 0x000105b8);
 	for (i = 0x052c; i <= 0x0538; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
+		nv_wo32(ctx, i, 0x00080008);
 	for (i = 0x055c; i <= 0x0598; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x05a4/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x05fc/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0604/4, 0x00004000);
-	nv_wo32(dev, ctx, 0x0610/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0618/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x061c/4, 0x00010000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x05a4, 0x4b7fffff);
+	nv_wo32(ctx, 0x05fc, 0x00000001);
+	nv_wo32(ctx, 0x0604, 0x00004000);
+	nv_wo32(ctx, 0x0610, 0x00000001);
+	nv_wo32(ctx, 0x0618, 0x00040000);
+	nv_wo32(ctx, 0x061c, 0x00010000);
 	for (i = 0x1a9c; i <= 0x22fc; i += 16) { /*XXX: check!! */
-		nv_wo32(dev, ctx, (i + 0)/4, 0x10700ff9);
-		nv_wo32(dev, ctx, (i + 4)/4, 0x0436086c);
-		nv_wo32(dev, ctx, (i + 8)/4, 0x000c001b);
+		nv_wo32(ctx, (i + 0), 0x10700ff9);
+		nv_wo32(ctx, (i + 4), 0x0436086c);
+		nv_wo32(ctx, (i + 8), 0x000c001b);
 	}
-	nv_wo32(dev, ctx, 0x269c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x26b0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x26dc/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x26e0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x26e4/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x26ec/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x26f0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x26f8/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x2700/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x3024/4, 0x000fe000);
-	nv_wo32(dev, ctx, 0x30a0/4, 0x000003f8);
-	nv_wo32(dev, ctx, 0x33fc/4, 0x002fe000);
+	nv_wo32(ctx, 0x269c, 0x3f800000);
+	nv_wo32(ctx, 0x26b0, 0x3f800000);
+	nv_wo32(ctx, 0x26dc, 0x40000000);
+	nv_wo32(ctx, 0x26e0, 0x3f800000);
+	nv_wo32(ctx, 0x26e4, 0x3f000000);
+	nv_wo32(ctx, 0x26ec, 0x40000000);
+	nv_wo32(ctx, 0x26f0, 0x3f800000);
+	nv_wo32(ctx, 0x26f8, 0xbf800000);
+	nv_wo32(ctx, 0x2700, 0xbf800000);
+	nv_wo32(ctx, 0x3024, 0x000fe000);
+	nv_wo32(ctx, 0x30a0, 0x000003f8);
+	nv_wo32(ctx, 0x33fc, 0x002fe000);
 	for (i = 0x341c; i <= 0x3438; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x001c527c);
+		nv_wo32(ctx, i, 0x001c527c);
 }
 
 static void
@@ -196,57 +196,57 @@ nv30_31_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x0410/4, 0x00000101);
-	nv_wo32(dev, ctx, 0x0424/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x0428/4, 0x00000060);
-	nv_wo32(dev, ctx, 0x0444/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0448/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x044c/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x0460/4, 0x44400000);
-	nv_wo32(dev, ctx, 0x048c/4, 0xffff0000);
+	nv_wo32(ctx, 0x0410, 0x00000101);
+	nv_wo32(ctx, 0x0424, 0x00000111);
+	nv_wo32(ctx, 0x0428, 0x00000060);
+	nv_wo32(ctx, 0x0444, 0x00000080);
+	nv_wo32(ctx, 0x0448, 0xffff0000);
+	nv_wo32(ctx, 0x044c, 0x00000001);
+	nv_wo32(ctx, 0x0460, 0x44400000);
+	nv_wo32(ctx, 0x048c, 0xffff0000);
 	for (i = 0x04e0; i < 0x04e8; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x04ec/4, 0x00011100);
+		nv_wo32(ctx, i, 0x0fff0000);
+	nv_wo32(ctx, 0x04ec, 0x00011100);
 	for (i = 0x0508; i < 0x0548; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x0550/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x058c/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0590/4, 0x30201000);
-	nv_wo32(dev, ctx, 0x0594/4, 0x70605040);
-	nv_wo32(dev, ctx, 0x0598/4, 0xb8a89888);
-	nv_wo32(dev, ctx, 0x059c/4, 0xf8e8d8c8);
-	nv_wo32(dev, ctx, 0x05b0/4, 0xb0000000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x0550, 0x4b7fffff);
+	nv_wo32(ctx, 0x058c, 0x00000080);
+	nv_wo32(ctx, 0x0590, 0x30201000);
+	nv_wo32(ctx, 0x0594, 0x70605040);
+	nv_wo32(ctx, 0x0598, 0xb8a89888);
+	nv_wo32(ctx, 0x059c, 0xf8e8d8c8);
+	nv_wo32(ctx, 0x05b0, 0xb0000000);
 	for (i = 0x0600; i < 0x0640; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00010588);
+		nv_wo32(ctx, i, 0x00010588);
 	for (i = 0x0640; i < 0x0680; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x06c0; i < 0x0700; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0008aae4);
+		nv_wo32(ctx, i, 0x0008aae4);
 	for (i = 0x0700; i < 0x0740; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x0740; i < 0x0780; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
-	nv_wo32(dev, ctx, 0x085c/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x0860/4, 0x00010000);
+		nv_wo32(ctx, i, 0x00080008);
+	nv_wo32(ctx, 0x085c, 0x00040000);
+	nv_wo32(ctx, 0x0860, 0x00010000);
 	for (i = 0x0864; i < 0x0874; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00040004);
+		nv_wo32(ctx, i, 0x00040004);
 	for (i = 0x1f18; i <= 0x3088 ; i += 16) {
-		nv_wo32(dev, ctx, i/4 + 0, 0x10700ff9);
-		nv_wo32(dev, ctx, i/4 + 1, 0x0436086c);
-		nv_wo32(dev, ctx, i/4 + 2, 0x000c001b);
+		nv_wo32(ctx, i + 0, 0x10700ff9);
+		nv_wo32(ctx, i + 4, 0x0436086c);
+		nv_wo32(ctx, i + 8, 0x000c001b);
 	}
 	for (i = 0x30b8; i < 0x30c8; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0000ffff);
-	nv_wo32(dev, ctx, 0x344c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3808/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x381c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3848/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x384c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3850/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x3858/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x385c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3864/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x386c/4, 0xbf800000);
+		nv_wo32(ctx, i, 0x0000ffff);
+	nv_wo32(ctx, 0x344c, 0x3f800000);
+	nv_wo32(ctx, 0x3808, 0x3f800000);
+	nv_wo32(ctx, 0x381c, 0x3f800000);
+	nv_wo32(ctx, 0x3848, 0x40000000);
+	nv_wo32(ctx, 0x384c, 0x3f800000);
+	nv_wo32(ctx, 0x3850, 0x3f000000);
+	nv_wo32(ctx, 0x3858, 0x40000000);
+	nv_wo32(ctx, 0x385c, 0x3f800000);
+	nv_wo32(ctx, 0x3864, 0xbf800000);
+	nv_wo32(ctx, 0x386c, 0xbf800000);
 }
 
 static void
@@ -254,57 +254,57 @@ nv34_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x040c/4, 0x01000101);
-	nv_wo32(dev, ctx, 0x0420/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x0424/4, 0x00000060);
-	nv_wo32(dev, ctx, 0x0440/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0444/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x0448/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x045c/4, 0x44400000);
-	nv_wo32(dev, ctx, 0x0480/4, 0xffff0000);
+	nv_wo32(ctx, 0x040c, 0x01000101);
+	nv_wo32(ctx, 0x0420, 0x00000111);
+	nv_wo32(ctx, 0x0424, 0x00000060);
+	nv_wo32(ctx, 0x0440, 0x00000080);
+	nv_wo32(ctx, 0x0444, 0xffff0000);
+	nv_wo32(ctx, 0x0448, 0x00000001);
+	nv_wo32(ctx, 0x045c, 0x44400000);
+	nv_wo32(ctx, 0x0480, 0xffff0000);
 	for (i = 0x04d4; i < 0x04dc; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x04e0/4, 0x00011100);
+		nv_wo32(ctx, i, 0x0fff0000);
+	nv_wo32(ctx, 0x04e0, 0x00011100);
 	for (i = 0x04fc; i < 0x053c; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x0544/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x057c/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0580/4, 0x30201000);
-	nv_wo32(dev, ctx, 0x0584/4, 0x70605040);
-	nv_wo32(dev, ctx, 0x0588/4, 0xb8a89888);
-	nv_wo32(dev, ctx, 0x058c/4, 0xf8e8d8c8);
-	nv_wo32(dev, ctx, 0x05a0/4, 0xb0000000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x0544, 0x4b7fffff);
+	nv_wo32(ctx, 0x057c, 0x00000080);
+	nv_wo32(ctx, 0x0580, 0x30201000);
+	nv_wo32(ctx, 0x0584, 0x70605040);
+	nv_wo32(ctx, 0x0588, 0xb8a89888);
+	nv_wo32(ctx, 0x058c, 0xf8e8d8c8);
+	nv_wo32(ctx, 0x05a0, 0xb0000000);
 	for (i = 0x05f0; i < 0x0630; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00010588);
+		nv_wo32(ctx, i, 0x00010588);
 	for (i = 0x0630; i < 0x0670; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x06b0; i < 0x06f0; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0008aae4);
+		nv_wo32(ctx, i, 0x0008aae4);
 	for (i = 0x06f0; i < 0x0730; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x0730; i < 0x0770; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
-	nv_wo32(dev, ctx, 0x0850/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x0854/4, 0x00010000);
+		nv_wo32(ctx, i, 0x00080008);
+	nv_wo32(ctx, 0x0850, 0x00040000);
+	nv_wo32(ctx, 0x0854, 0x00010000);
 	for (i = 0x0858; i < 0x0868; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00040004);
+		nv_wo32(ctx, i, 0x00040004);
 	for (i = 0x15ac; i <= 0x271c ; i += 16) {
-		nv_wo32(dev, ctx, i/4 + 0, 0x10700ff9);
-		nv_wo32(dev, ctx, i/4 + 1, 0x0436086c);
-		nv_wo32(dev, ctx, i/4 + 2, 0x000c001b);
+		nv_wo32(ctx, i + 0, 0x10700ff9);
+		nv_wo32(ctx, i + 4, 0x0436086c);
+		nv_wo32(ctx, i + 8, 0x000c001b);
 	}
 	for (i = 0x274c; i < 0x275c; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0000ffff);
-	nv_wo32(dev, ctx, 0x2ae0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2e9c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2eb0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2edc/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2ee0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2ee4/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x2eec/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x2ef0/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x2ef8/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x2f00/4, 0xbf800000);
+		nv_wo32(ctx, i, 0x0000ffff);
+	nv_wo32(ctx, 0x2ae0, 0x3f800000);
+	nv_wo32(ctx, 0x2e9c, 0x3f800000);
+	nv_wo32(ctx, 0x2eb0, 0x3f800000);
+	nv_wo32(ctx, 0x2edc, 0x40000000);
+	nv_wo32(ctx, 0x2ee0, 0x3f800000);
+	nv_wo32(ctx, 0x2ee4, 0x3f000000);
+	nv_wo32(ctx, 0x2eec, 0x40000000);
+	nv_wo32(ctx, 0x2ef0, 0x3f800000);
+	nv_wo32(ctx, 0x2ef8, 0xbf800000);
+	nv_wo32(ctx, 0x2f00, 0xbf800000);
 }
 
 static void
@@ -312,57 +312,57 @@ nv35_36_graph_context_init(struct drm_device *dev, struct nouveau_gpuobj *ctx)
 {
 	int i;
 
-	nv_wo32(dev, ctx, 0x040c/4, 0x00000101);
-	nv_wo32(dev, ctx, 0x0420/4, 0x00000111);
-	nv_wo32(dev, ctx, 0x0424/4, 0x00000060);
-	nv_wo32(dev, ctx, 0x0440/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x0444/4, 0xffff0000);
-	nv_wo32(dev, ctx, 0x0448/4, 0x00000001);
-	nv_wo32(dev, ctx, 0x045c/4, 0x44400000);
-	nv_wo32(dev, ctx, 0x0488/4, 0xffff0000);
+	nv_wo32(ctx, 0x040c, 0x00000101);
+	nv_wo32(ctx, 0x0420, 0x00000111);
+	nv_wo32(ctx, 0x0424, 0x00000060);
+	nv_wo32(ctx, 0x0440, 0x00000080);
+	nv_wo32(ctx, 0x0444, 0xffff0000);
+	nv_wo32(ctx, 0x0448, 0x00000001);
+	nv_wo32(ctx, 0x045c, 0x44400000);
+	nv_wo32(ctx, 0x0488, 0xffff0000);
 	for (i = 0x04dc; i < 0x04e4; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0fff0000);
-	nv_wo32(dev, ctx, 0x04e8/4, 0x00011100);
+		nv_wo32(ctx, i, 0x0fff0000);
+	nv_wo32(ctx, 0x04e8, 0x00011100);
 	for (i = 0x0504; i < 0x0544; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x07ff0000);
-	nv_wo32(dev, ctx, 0x054c/4, 0x4b7fffff);
-	nv_wo32(dev, ctx, 0x0588/4, 0x00000080);
-	nv_wo32(dev, ctx, 0x058c/4, 0x30201000);
-	nv_wo32(dev, ctx, 0x0590/4, 0x70605040);
-	nv_wo32(dev, ctx, 0x0594/4, 0xb8a89888);
-	nv_wo32(dev, ctx, 0x0598/4, 0xf8e8d8c8);
-	nv_wo32(dev, ctx, 0x05ac/4, 0xb0000000);
+		nv_wo32(ctx, i, 0x07ff0000);
+	nv_wo32(ctx, 0x054c, 0x4b7fffff);
+	nv_wo32(ctx, 0x0588, 0x00000080);
+	nv_wo32(ctx, 0x058c, 0x30201000);
+	nv_wo32(ctx, 0x0590, 0x70605040);
+	nv_wo32(ctx, 0x0594, 0xb8a89888);
+	nv_wo32(ctx, 0x0598, 0xf8e8d8c8);
+	nv_wo32(ctx, 0x05ac, 0xb0000000);
 	for (i = 0x0604; i < 0x0644; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00010588);
+		nv_wo32(ctx, i, 0x00010588);
 	for (i = 0x0644; i < 0x0684; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00030303);
+		nv_wo32(ctx, i, 0x00030303);
 	for (i = 0x06c4; i < 0x0704; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0008aae4);
+		nv_wo32(ctx, i, 0x0008aae4);
 	for (i = 0x0704; i < 0x0744; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x01012000);
+		nv_wo32(ctx, i, 0x01012000);
 	for (i = 0x0744; i < 0x0784; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00080008);
-	nv_wo32(dev, ctx, 0x0860/4, 0x00040000);
-	nv_wo32(dev, ctx, 0x0864/4, 0x00010000);
+		nv_wo32(ctx, i, 0x00080008);
+	nv_wo32(ctx, 0x0860, 0x00040000);
+	nv_wo32(ctx, 0x0864, 0x00010000);
 	for (i = 0x0868; i < 0x0878; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x00040004);
+		nv_wo32(ctx, i, 0x00040004);
 	for (i = 0x1f1c; i <= 0x308c ; i += 16) {
-		nv_wo32(dev, ctx, i/4 + 0, 0x10700ff9);
-		nv_wo32(dev, ctx, i/4 + 1, 0x0436086c);
-		nv_wo32(dev, ctx, i/4 + 2, 0x000c001b);
+		nv_wo32(ctx, i + 0, 0x10700ff9);
+		nv_wo32(ctx, i + 4, 0x0436086c);
+		nv_wo32(ctx, i + 8, 0x000c001b);
 	}
 	for (i = 0x30bc; i < 0x30cc; i += 4)
-		nv_wo32(dev, ctx, i/4, 0x0000ffff);
-	nv_wo32(dev, ctx, 0x3450/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x380c/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3820/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x384c/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x3850/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3854/4, 0x3f000000);
-	nv_wo32(dev, ctx, 0x385c/4, 0x40000000);
-	nv_wo32(dev, ctx, 0x3860/4, 0x3f800000);
-	nv_wo32(dev, ctx, 0x3868/4, 0xbf800000);
-	nv_wo32(dev, ctx, 0x3870/4, 0xbf800000);
+		nv_wo32(ctx, i, 0x0000ffff);
+	nv_wo32(ctx, 0x3450, 0x3f800000);
+	nv_wo32(ctx, 0x380c, 0x3f800000);
+	nv_wo32(ctx, 0x3820, 0x3f800000);
+	nv_wo32(ctx, 0x384c, 0x40000000);
+	nv_wo32(ctx, 0x3850, 0x3f800000);
+	nv_wo32(ctx, 0x3854, 0x3f000000);
+	nv_wo32(ctx, 0x385c, 0x40000000);
+	nv_wo32(ctx, 0x3860, 0x3f800000);
+	nv_wo32(ctx, 0x3868, 0xbf800000);
+	nv_wo32(ctx, 0x3870, 0xbf800000);
 }
 
 int
@@ -370,68 +370,52 @@ nv20_graph_create_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
 	void (*ctx_init)(struct drm_device *, struct nouveau_gpuobj *);
-	unsigned int ctx_size;
-	unsigned int idoffs = 0x28/4;
+	unsigned int idoffs = 0x28;
 	int ret;
 
 	switch (dev_priv->chipset) {
 	case 0x20:
-		ctx_size = NV20_GRCTX_SIZE;
 		ctx_init = nv20_graph_context_init;
 		idoffs = 0;
 		break;
 	case 0x25:
 	case 0x28:
-		ctx_size = NV25_GRCTX_SIZE;
 		ctx_init = nv25_graph_context_init;
 		break;
 	case 0x2a:
-		ctx_size = NV2A_GRCTX_SIZE;
 		ctx_init = nv2a_graph_context_init;
 		idoffs = 0;
 		break;
 	case 0x30:
 	case 0x31:
-		ctx_size = NV30_31_GRCTX_SIZE;
 		ctx_init = nv30_31_graph_context_init;
 		break;
 	case 0x34:
-		ctx_size = NV34_GRCTX_SIZE;
 		ctx_init = nv34_graph_context_init;
 		break;
 	case 0x35:
 	case 0x36:
-		ctx_size = NV35_36_GRCTX_SIZE;
 		ctx_init = nv35_36_graph_context_init;
 		break;
 	default:
-		ctx_size = 0;
-		ctx_init = nv35_36_graph_context_init;
-		NV_ERROR(dev, "Please contact the devs if you want your NV%x"
-			      " card to work\n", dev_priv->chipset);
-		return -ENOSYS;
-		break;
+		BUG_ON(1);
 	}
 
-	ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, ctx_size, 16,
-					  NVOBJ_FLAG_ZERO_ALLOC,
-					  &chan->ramin_grctx);
+	ret = nouveau_gpuobj_new(dev, chan, pgraph->grctx_size, 16,
+				 NVOBJ_FLAG_ZERO_ALLOC, &chan->ramin_grctx);
 	if (ret)
 		return ret;
 
 	/* Initialise default context values */
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	ctx_init(dev, chan->ramin_grctx->gpuobj);
+	ctx_init(dev, chan->ramin_grctx);
 
 	/* nv20: nv_wo32(dev, chan->ramin_grctx->gpuobj, 10, chan->id<<24); */
-	nv_wo32(dev, chan->ramin_grctx->gpuobj, idoffs,
-					(chan->id << 24) | 0x1); /* CTX_USER */
+	nv_wo32(chan->ramin_grctx, idoffs,
+		(chan->id << 24) | 0x1); /* CTX_USER */
 
-	nv_wo32(dev, dev_priv->ctx_table->gpuobj, chan->id,
-			chan->ramin_grctx->instance >> 4);
-
-	dev_priv->engine.instmem.finish_access(dev);
+	nv_wo32(pgraph->ctx_table, chan->id * 4, chan->ramin_grctx->pinst >> 4);
 	return 0;
 }
 
@@ -440,13 +424,10 @@ nv20_graph_destroy_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
 
-	if (chan->ramin_grctx)
-		nouveau_gpuobj_ref_del(dev, &chan->ramin_grctx);
-
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	nv_wo32(dev, dev_priv->ctx_table->gpuobj, chan->id, 0);
-	dev_priv->engine.instmem.finish_access(dev);
+	nouveau_gpuobj_ref(NULL, &chan->ramin_grctx);
+	nv_wo32(pgraph->ctx_table, chan->id * 4, 0);
 }
 
 int
@@ -457,7 +438,7 @@ nv20_graph_load_context(struct nouveau_channel *chan)
 
 	if (!chan->ramin_grctx)
 		return -EINVAL;
-	inst = chan->ramin_grctx->instance >> 4;
+	inst = chan->ramin_grctx->pinst >> 4;
 
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_POINTER, inst);
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_XFER,
@@ -480,7 +461,7 @@ nv20_graph_unload_context(struct drm_device *dev)
 	chan = pgraph->channel(dev);
 	if (!chan)
 		return 0;
-	inst = chan->ramin_grctx->instance >> 4;
+	inst = chan->ramin_grctx->pinst >> 4;
 
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_POINTER, inst);
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_XFER,
@@ -538,29 +519,44 @@ nv20_graph_set_region_tiling(struct drm_device *dev, int i, uint32_t addr,
 int
 nv20_graph_init(struct drm_device *dev)
 {
-	struct drm_nouveau_private *dev_priv =
-		(struct drm_nouveau_private *)dev->dev_private;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
 	uint32_t tmp, vramsz;
 	int ret, i;
 
+	switch (dev_priv->chipset) {
+	case 0x20:
+		pgraph->grctx_size = NV20_GRCTX_SIZE;
+		break;
+	case 0x25:
+	case 0x28:
+		pgraph->grctx_size = NV25_GRCTX_SIZE;
+		break;
+	case 0x2a:
+		pgraph->grctx_size = NV2A_GRCTX_SIZE;
+		break;
+	default:
+		NV_ERROR(dev, "unknown chipset, disabling acceleration\n");
+		pgraph->accel_blocked = true;
+		return 0;
+	}
+
 	nv_wr32(dev, NV03_PMC_ENABLE,
 		nv_rd32(dev, NV03_PMC_ENABLE) & ~NV_PMC_ENABLE_PGRAPH);
 	nv_wr32(dev, NV03_PMC_ENABLE,
 		nv_rd32(dev, NV03_PMC_ENABLE) |  NV_PMC_ENABLE_PGRAPH);
 
-	if (!dev_priv->ctx_table) {
+	if (!pgraph->ctx_table) {
 		/* Create Context Pointer Table */
-		dev_priv->ctx_table_size = 32 * 4;
-		ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0,
-						  dev_priv->ctx_table_size, 16,
-						  NVOBJ_FLAG_ZERO_ALLOC,
-						  &dev_priv->ctx_table);
+		ret = nouveau_gpuobj_new(dev, NULL, 32 * 4, 16,
+					 NVOBJ_FLAG_ZERO_ALLOC,
+					 &pgraph->ctx_table);
 		if (ret)
 			return ret;
 	}
 
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_TABLE,
-		 dev_priv->ctx_table->instance >> 4);
+		     pgraph->ctx_table->pinst >> 4);
 
 	nv20_graph_rdi(dev);
 
@@ -644,34 +640,52 @@ void
 nv20_graph_takedown(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
 
-	nouveau_gpuobj_ref_del(dev, &dev_priv->ctx_table);
+	nouveau_gpuobj_ref(NULL, &pgraph->ctx_table);
 }
 
 int
 nv30_graph_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
 	int ret, i;
 
+	switch (dev_priv->chipset) {
+	case 0x30:
+	case 0x31:
+		pgraph->grctx_size = NV30_31_GRCTX_SIZE;
+		break;
+	case 0x34:
+		pgraph->grctx_size = NV34_GRCTX_SIZE;
+		break;
+	case 0x35:
+	case 0x36:
+		pgraph->grctx_size = NV35_36_GRCTX_SIZE;
+		break;
+	default:
+		NV_ERROR(dev, "unknown chipset, disabling acceleration\n");
+		pgraph->accel_blocked = true;
+		return 0;
+	}
+
 	nv_wr32(dev, NV03_PMC_ENABLE,
 		nv_rd32(dev, NV03_PMC_ENABLE) & ~NV_PMC_ENABLE_PGRAPH);
 	nv_wr32(dev, NV03_PMC_ENABLE,
 		nv_rd32(dev, NV03_PMC_ENABLE) |  NV_PMC_ENABLE_PGRAPH);
 
-	if (!dev_priv->ctx_table) {
+	if (!pgraph->ctx_table) {
 		/* Create Context Pointer Table */
-		dev_priv->ctx_table_size = 32 * 4;
-		ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0,
-						  dev_priv->ctx_table_size, 16,
-						  NVOBJ_FLAG_ZERO_ALLOC,
-						  &dev_priv->ctx_table);
+		ret = nouveau_gpuobj_new(dev, NULL, 32 * 4, 16,
+					 NVOBJ_FLAG_ZERO_ALLOC,
+					 &pgraph->ctx_table);
 		if (ret)
 			return ret;
 	}
 
 	nv_wr32(dev, NV20_PGRAPH_CHANNEL_CTX_TABLE,
-			dev_priv->ctx_table->instance >> 4);
+		     pgraph->ctx_table->pinst >> 4);
 
 	nv_wr32(dev, NV03_PGRAPH_INTR   , 0xFFFFFFFF);
 	nv_wr32(dev, NV03_PGRAPH_INTR_EN, 0xFFFFFFFF);
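
Reviewer note (not part of the patch): in the hunks above, nv_wo32() stops taking a 32-bit word index and starts taking a byte offset, which is why idoffs changes from 0x28/4 to 0x28 and the context table write changes from chan->id to chan->id * 4. The following is a minimal standalone sketch of that equivalence. struct fake_gpuobj and both helpers are hypothetical stand-ins for illustration only, not the driver's real accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GPU object: a flat array of 32-bit words. */
struct fake_gpuobj {
	uint32_t words[64];
};

/* Old convention (assumed): the position is a 32-bit word index. */
static void wo32_by_index(struct fake_gpuobj *obj, size_t index, uint32_t val)
{
	assert(index < 64);
	obj->words[index] = val;
}

/* New convention (assumed): the position is a 4-byte-aligned byte offset. */
static void wo32_by_offset(struct fake_gpuobj *obj, size_t offset, uint32_t val)
{
	assert(offset % 4 == 0 && offset / 4 < 64);
	obj->words[offset / 4] = val;
}

/* Returns 1 when index N and byte offset N * 4 hit the same word. */
static int same_slot(size_t index, uint32_t val)
{
	struct fake_gpuobj a = {{0}}, b = {{0}};

	wo32_by_index(&a, index, val);
	wo32_by_offset(&b, index * 4, val);
	return a.words[index] == val && b.words[index] == val;
}
```

Under these assumptions, same_slot(0x28 / 4, (chan_id << 24) | 0x1) holds for any channel id, which matches the idoffs change in nv20_graph_create_context().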
diff --git a/drivers/gpu/drm/nouveau/nv30_fb.c b/drivers/gpu/drm/nouveau/nv30_fb.c
new file mode 100644
index 0000000..4a3f2f0
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nv30_fb.c
@@ -0,0 +1,95 @@
+/*
+ * Copyright (C) 2010 Francisco Jerez.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "drmP.h"
+#include "drm.h"
+#include "nouveau_drv.h"
+#include "nouveau_drm.h"
+
+static int
+calc_bias(struct drm_device *dev, int k, int i, int j)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	int b = (dev_priv->chipset > 0x30 ?
+		 nv_rd32(dev, 0x122c + 0x10 * k + 0x4 * j) >> (4 * (i ^ 1)) :
+		 0) & 0xf;
+
+	return 2 * (b & 0x8 ? b - 0x10 : b);
+}
+
+static int
+calc_ref(struct drm_device *dev, int l, int k, int i)
+{
+	int j, x = 0;
+
+	for (j = 0; j < 4; j++) {
+		int m = (l >> (8 * i) & 0xff) + calc_bias(dev, k, i, j);
+
+		x |= (0x80 | clamp(m, 0, 0x1f)) << (8 * j);
+	}
+
+	return x;
+}
+
+int
+nv30_fb_init(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_fb_engine *pfb = &dev_priv->engine.fb;
+	int i, j;
+
+	pfb->num_tiles = NV10_PFB_TILE__SIZE;
+
+	/* Turn all the tiling regions off. */
+	for (i = 0; i < pfb->num_tiles; i++)
+		pfb->set_region_tiling(dev, i, 0, 0, 0);
+
+	/* Init the memory timing regs at 0x10037c/0x1003ac */
+	if (dev_priv->chipset == 0x30 ||
+	    dev_priv->chipset == 0x31 ||
+	    dev_priv->chipset == 0x35) {
+		/* Related to ROP count */
+		int n = (dev_priv->chipset == 0x31 ? 2 : 4);
+		int l = nv_rd32(dev, 0x1003d0);
+
+		for (i = 0; i < n; i++) {
+			for (j = 0; j < 3; j++)
+				nv_wr32(dev, 0x10037c + 0xc * i + 0x4 * j,
+					calc_ref(dev, l, 0, j));
+
+			for (j = 0; j < 2; j++)
+				nv_wr32(dev, 0x1003ac + 0x8 * i + 0x4 * j,
+					calc_ref(dev, l, 1, j));
+		}
+	}
+
+	return 0;
+}
+
+void
+nv30_fb_takedown(struct drm_device *dev)
+{
+}
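
Reviewer note (not part of the patch): calc_bias() above sign-extends a 4-bit field and doubles it, and calc_ref() adds that bias to one byte of l, clamps the sum to 0..0x1f, sets bit 7, and packs four such bytes into one word. The sketch below reproduces only that arithmetic. It assumes the four nibbles are passed in as an array instead of being read from registers, and the names bias_from_nibble, clamp_int and pack_ref are hypothetical.

```c
#include <stdint.h>

/* Sign-extend a 4-bit field and double it, as calc_bias() does with b. */
static int bias_from_nibble(uint32_t b)
{
	b &= 0xf;
	return 2 * ((b & 0x8) ? (int)b - 0x10 : (int)b);
}

/* Local clamp, standing in for the kernel's clamp() macro. */
static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Mirror of the calc_ref() packing: byte i of l plus a per-lane bias,
 * clamped to 5 bits, with bit 7 set, one result byte per lane j.
 * The biases come from nibbles[] here instead of a register read.
 */
static uint32_t pack_ref(uint32_t l, int i, const uint32_t nibbles[4])
{
	uint32_t x = 0;
	int j;

	for (j = 0; j < 4; j++) {
		int m = (int)((l >> (8 * i)) & 0xff) +
			bias_from_nibble(nibbles[j]);

		x |= (0x80u | (uint32_t)clamp_int(m, 0, 0x1f)) << (8 * j);
	}

	return x;
}
```

With all nibbles zero (the chipset 0x30 case, where calc_bias() returns 0), every lane carries the same clamped byte of l with bit 7 set.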
diff --git a/drivers/gpu/drm/nouveau/nv40_fifo.c b/drivers/gpu/drm/nouveau/nv40_fifo.c
index 500ccfd..d337b8b 100644
--- a/drivers/gpu/drm/nouveau/nv40_fifo.c
+++ b/drivers/gpu/drm/nouveau/nv40_fifo.c
@@ -27,8 +27,9 @@
 #include "drmP.h"
 #include "nouveau_drv.h"
 #include "nouveau_drm.h"
+#include "nouveau_ramht.h"
 
-#define NV40_RAMFC(c) (dev_priv->ramfc_offset + ((c) * NV40_RAMFC__SIZE))
+#define NV40_RAMFC(c) (dev_priv->ramfc->pinst + ((c) * NV40_RAMFC__SIZE))
 #define NV40_RAMFC__SIZE 128
 
 int
@@ -42,16 +43,15 @@ nv40_fifo_create_context(struct nouveau_channel *chan)
 
 	ret = nouveau_gpuobj_new_fake(dev, NV40_RAMFC(chan->id), ~0,
 				      NV40_RAMFC__SIZE, NVOBJ_FLAG_ZERO_ALLOC |
-				      NVOBJ_FLAG_ZERO_FREE, NULL, &chan->ramfc);
+				      NVOBJ_FLAG_ZERO_FREE, &chan->ramfc);
 	if (ret)
 		return ret;
 
 	spin_lock_irqsave(&dev_priv->context_switch_lock, flags);
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	nv_wi32(dev, fc +  0, chan->pushbuf_base);
 	nv_wi32(dev, fc +  4, chan->pushbuf_base);
-	nv_wi32(dev, fc + 12, chan->pushbuf->instance >> 4);
+	nv_wi32(dev, fc + 12, chan->pushbuf->pinst >> 4);
 	nv_wi32(dev, fc + 24, NV_PFIFO_CACHE1_DMA_FETCH_TRIG_128_BYTES |
 			      NV_PFIFO_CACHE1_DMA_FETCH_SIZE_128_BYTES |
 			      NV_PFIFO_CACHE1_DMA_FETCH_MAX_REQS_8 |
@@ -59,9 +59,8 @@ nv40_fifo_create_context(struct nouveau_channel *chan)
 			      NV_PFIFO_CACHE1_BIG_ENDIAN |
 #endif
 			      0x30000000 /* no idea.. */);
-	nv_wi32(dev, fc + 56, chan->ramin_grctx->instance >> 4);
+	nv_wi32(dev, fc + 56, chan->ramin_grctx->pinst >> 4);
 	nv_wi32(dev, fc + 60, 0x0001FFFF);
-	dev_priv->engine.instmem.finish_access(dev);
 
 	/* enable the fifo dma operation */
 	nv_wr32(dev, NV04_PFIFO_MODE,
@@ -79,8 +78,7 @@ nv40_fifo_destroy_context(struct nouveau_channel *chan)
 	nv_wr32(dev, NV04_PFIFO_MODE,
 		nv_rd32(dev, NV04_PFIFO_MODE) & ~(1 << chan->id));
 
-	if (chan->ramfc)
-		nouveau_gpuobj_ref_del(dev, &chan->ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->ramfc);
 }
 
 static void
@@ -89,8 +87,6 @@ nv40_fifo_do_load_context(struct drm_device *dev, int chid)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	uint32_t fc = NV40_RAMFC(chid), tmp, tmp2;
 
-	dev_priv->engine.instmem.prepare_access(dev, false);
-
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_PUT, nv_ri32(dev, fc + 0));
 	nv_wr32(dev, NV04_PFIFO_CACHE1_DMA_GET, nv_ri32(dev, fc + 4));
 	nv_wr32(dev, NV10_PFIFO_CACHE1_REF_CNT, nv_ri32(dev, fc + 8));
@@ -127,8 +123,6 @@ nv40_fifo_do_load_context(struct drm_device *dev, int chid)
 	nv_wr32(dev, 0x2088, nv_ri32(dev, fc + 76));
 	nv_wr32(dev, 0x3300, nv_ri32(dev, fc + 80));
 
-	dev_priv->engine.instmem.finish_access(dev);
-
 	nv_wr32(dev, NV03_PFIFO_CACHE1_GET, 0);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUT, 0);
 }
@@ -166,7 +160,6 @@ nv40_fifo_unload_context(struct drm_device *dev)
 		return 0;
 	fc = NV40_RAMFC(chid);
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	nv_wi32(dev, fc + 0, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_PUT));
 	nv_wi32(dev, fc + 4, nv_rd32(dev, NV04_PFIFO_CACHE1_DMA_GET));
 	nv_wi32(dev, fc + 8, nv_rd32(dev, NV10_PFIFO_CACHE1_REF_CNT));
@@ -200,7 +193,6 @@ nv40_fifo_unload_context(struct drm_device *dev)
 	tmp |= (nv_rd32(dev, NV04_PFIFO_CACHE1_PUT) << 16);
 	nv_wi32(dev, fc + 72, tmp);
 #endif
-	dev_priv->engine.instmem.finish_access(dev);
 
 	nv40_fifo_do_load_context(dev, pfifo->channels - 1);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUSH1,
@@ -249,9 +241,9 @@ nv40_fifo_init_ramxx(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
 	nv_wr32(dev, NV03_PFIFO_RAMHT, (0x03 << 24) /* search 128 */ |
-				       ((dev_priv->ramht_bits - 9) << 16) |
-				       (dev_priv->ramht_offset >> 8));
-	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro_offset>>8);
+				       ((dev_priv->ramht->bits - 9) << 16) |
+				       (dev_priv->ramht->gpuobj->pinst >> 8));
+	nv_wr32(dev, NV03_PFIFO_RAMRO, dev_priv->ramro->pinst >> 8);
 
 	switch (dev_priv->chipset) {
 	case 0x47:
@@ -279,7 +271,7 @@ nv40_fifo_init_ramxx(struct drm_device *dev)
 		nv_wr32(dev, 0x2230, 0);
 		nv_wr32(dev, NV40_PFIFO_RAMFC,
 			((dev_priv->vram_size - 512 * 1024 +
-			  dev_priv->ramfc_offset) >> 16) | (3 << 16));
+			  dev_priv->ramfc->pinst) >> 16) | (3 << 16));
 		break;
 	}
 }
diff --git a/drivers/gpu/drm/nouveau/nv40_graph.c b/drivers/gpu/drm/nouveau/nv40_graph.c
index 704a25d..2424289 100644
--- a/drivers/gpu/drm/nouveau/nv40_graph.c
+++ b/drivers/gpu/drm/nouveau/nv40_graph.c
@@ -45,7 +45,7 @@ nv40_graph_channel(struct drm_device *dev)
 		struct nouveau_channel *chan = dev_priv->fifos[i];
 
 		if (chan && chan->ramin_grctx &&
-		    chan->ramin_grctx->instance == inst)
+		    chan->ramin_grctx->pinst == inst)
 			return chan;
 	}
 
@@ -58,36 +58,28 @@ nv40_graph_create_context(struct nouveau_channel *chan)
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
+	struct nouveau_grctx ctx = {};
 	int ret;
 
-	ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, pgraph->grctx_size,
-				     16, NVOBJ_FLAG_ZERO_ALLOC,
-				     &chan->ramin_grctx);
+	ret = nouveau_gpuobj_new(dev, chan, pgraph->grctx_size, 16,
+				 NVOBJ_FLAG_ZERO_ALLOC, &chan->ramin_grctx);
 	if (ret)
 		return ret;
 
 	/* Initialise default context values */
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	if (!pgraph->ctxprog) {
-		struct nouveau_grctx ctx = {};
-
-		ctx.dev = chan->dev;
-		ctx.mode = NOUVEAU_GRCTX_VALS;
-		ctx.data = chan->ramin_grctx->gpuobj;
-		nv40_grctx_init(&ctx);
-	} else {
-		nouveau_grctx_vals_load(dev, chan->ramin_grctx->gpuobj);
-	}
-	nv_wo32(dev, chan->ramin_grctx->gpuobj, 0,
-		     chan->ramin_grctx->gpuobj->im_pramin->start);
-	dev_priv->engine.instmem.finish_access(dev);
+	ctx.dev = chan->dev;
+	ctx.mode = NOUVEAU_GRCTX_VALS;
+	ctx.data = chan->ramin_grctx;
+	nv40_grctx_init(&ctx);
+
+	nv_wo32(chan->ramin_grctx, 0, chan->ramin_grctx->pinst);
 	return 0;
 }
 
 void
 nv40_graph_destroy_context(struct nouveau_channel *chan)
 {
-	nouveau_gpuobj_ref_del(chan->dev, &chan->ramin_grctx);
+	nouveau_gpuobj_ref(NULL, &chan->ramin_grctx);
 }
 
 static int
@@ -141,7 +133,7 @@ nv40_graph_load_context(struct nouveau_channel *chan)
 
 	if (!chan->ramin_grctx)
 		return -EINVAL;
-	inst = chan->ramin_grctx->instance >> 4;
+	inst = chan->ramin_grctx->pinst >> 4;
 
 	ret = nv40_graph_transfer_context(dev, inst, 0);
 	if (ret)
@@ -238,7 +230,8 @@ nv40_graph_init(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv =
 		(struct drm_nouveau_private *)dev->dev_private;
 	struct nouveau_fb_engine *pfb = &dev_priv->engine.fb;
-	uint32_t vramsz;
+	struct nouveau_grctx ctx = {};
+	uint32_t vramsz, *cp;
 	int i, j;
 
 	nv_wr32(dev, NV03_PMC_ENABLE, nv_rd32(dev, NV03_PMC_ENABLE) &
@@ -246,32 +239,22 @@ nv40_graph_init(struct drm_device *dev)
 	nv_wr32(dev, NV03_PMC_ENABLE, nv_rd32(dev, NV03_PMC_ENABLE) |
 			 NV_PMC_ENABLE_PGRAPH);
 
-	if (nouveau_ctxfw) {
-		nouveau_grctx_prog_load(dev);
-		dev_priv->engine.graph.grctx_size = 175 * 1024;
-	}
+	cp = kmalloc(sizeof(*cp) * 256, GFP_KERNEL);
+	if (!cp)
+		return -ENOMEM;
 
-	if (!dev_priv->engine.graph.ctxprog) {
-		struct nouveau_grctx ctx = {};
-		uint32_t *cp;
+	ctx.dev = dev;
+	ctx.mode = NOUVEAU_GRCTX_PROG;
+	ctx.data = cp;
+	ctx.ctxprog_max = 256;
+	nv40_grctx_init(&ctx);
+	dev_priv->engine.graph.grctx_size = ctx.ctxvals_pos * 4;
 
-		cp = kmalloc(sizeof(*cp) * 256, GFP_KERNEL);
-		if (!cp)
-			return -ENOMEM;
+	nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_INDEX, 0);
+	for (i = 0; i < ctx.ctxprog_len; i++)
+		nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_DATA, cp[i]);
 
-		ctx.dev = dev;
-		ctx.mode = NOUVEAU_GRCTX_PROG;
-		ctx.data = cp;
-		ctx.ctxprog_max = 256;
-		nv40_grctx_init(&ctx);
-		dev_priv->engine.graph.grctx_size = ctx.ctxvals_pos * 4;
-
-		nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_INDEX, 0);
-		for (i = 0; i < ctx.ctxprog_len; i++)
-			nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_DATA, cp[i]);
-
-		kfree(cp);
-	}
+	kfree(cp);
 
 	/* No context present currently */
 	nv_wr32(dev, NV40_PGRAPH_CTXCTL_CUR, 0x00000000);
@@ -407,7 +390,6 @@ nv40_graph_init(struct drm_device *dev)
 
 void nv40_graph_takedown(struct drm_device *dev)
 {
-	nouveau_grctx_fini(dev);
 }
 
 struct nouveau_pgraph_object_class nv40_graph_grclass[] = {
diff --git a/drivers/gpu/drm/nouveau/nv40_grctx.c b/drivers/gpu/drm/nouveau/nv40_grctx.c
index 9b5c974..ce58509 100644
--- a/drivers/gpu/drm/nouveau/nv40_grctx.c
+++ b/drivers/gpu/drm/nouveau/nv40_grctx.c
@@ -596,13 +596,13 @@ nv40_graph_construct_shader(struct nouveau_grctx *ctx)
 
 	offset += 0x0280/4;
 	for (i = 0; i < 16; i++, offset += 2)
-		nv_wo32(dev, obj, offset, 0x3f800000);
+		nv_wo32(obj, offset * 4, 0x3f800000);
 
 	for (vs = 0; vs < vs_nr; vs++, offset += vs_len) {
 		for (i = 0; i < vs_nr_b0 * 6; i += 6)
-			nv_wo32(dev, obj, offset + b0_offset + i, 0x00000001);
+			nv_wo32(obj, (offset + b0_offset + i) * 4, 0x00000001);
 		for (i = 0; i < vs_nr_b1 * 4; i += 4)
-			nv_wo32(dev, obj, offset + b1_offset + i, 0x3f800000);
+			nv_wo32(obj, (offset + b1_offset + i) * 4, 0x3f800000);
 	}
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv40_mc.c b/drivers/gpu/drm/nouveau/nv40_mc.c
index 2a3495e..e4e72c1 100644
--- a/drivers/gpu/drm/nouveau/nv40_mc.c
+++ b/drivers/gpu/drm/nouveau/nv40_mc.c
@@ -19,7 +19,7 @@ nv40_mc_init(struct drm_device *dev)
 	case 0x46: /* G72 */
 	case 0x4e:
 	case 0x4c: /* C51_G7X */
-		tmp = nv_rd32(dev, NV40_PFB_020C);
+		tmp = nv_rd32(dev, NV04_PFB_FIFO_DATA);
 		nv_wr32(dev, NV40_PMC_1700, tmp);
 		nv_wr32(dev, NV40_PMC_1704, 0);
 		nv_wr32(dev, NV40_PMC_1708, 0);
diff --git a/drivers/gpu/drm/nouveau/nv50_crtc.c b/drivers/gpu/drm/nouveau/nv50_crtc.c
index b4e4a3b..2423c92 100644
--- a/drivers/gpu/drm/nouveau/nv50_crtc.c
+++ b/drivers/gpu/drm/nouveau/nv50_crtc.c
@@ -264,11 +264,16 @@ nv50_crtc_set_scale(struct nouveau_crtc *nv_crtc, int scaling_mode, bool update)
 int
 nv50_crtc_set_clock(struct drm_device *dev, int head, int pclk)
 {
-	uint32_t reg = NV50_PDISPLAY_CRTC_CLK_CTRL1(head);
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct pll_lims pll;
-	uint32_t reg1, reg2;
+	uint32_t reg, reg1, reg2;
 	int ret, N1, M1, N2, M2, P;
 
+	if (dev_priv->chipset < NV_C0)
+		reg = NV50_PDISPLAY_CRTC_CLK_CTRL1(head);
+	else
+		reg = 0x614140 + (head * 0x800);
+
 	ret = get_pll_limits(dev, reg, &pll);
 	if (ret)
 		return ret;
@@ -286,7 +291,8 @@ nv50_crtc_set_clock(struct drm_device *dev, int head, int pclk)
 		nv_wr32(dev, reg, 0x10000611);
 		nv_wr32(dev, reg + 4, reg1 | (M1 << 16) | N1);
 		nv_wr32(dev, reg + 8, reg2 | (P << 28) | (M2 << 16) | N2);
-	} else {
+	} else
+	if (dev_priv->chipset < NV_C0) {
 		ret = nv50_calc_pll2(dev, &pll, pclk, &N1, &N2, &M1, &P);
 		if (ret <= 0)
 			return 0;
@@ -298,6 +304,17 @@ nv50_crtc_set_clock(struct drm_device *dev, int head, int pclk)
 		nv_wr32(dev, reg, 0x50000610);
 		nv_wr32(dev, reg + 4, reg1 | (P << 16) | (M1 << 8) | N1);
 		nv_wr32(dev, reg + 8, N2);
+	} else {
+		ret = nv50_calc_pll2(dev, &pll, pclk, &N1, &N2, &M1, &P);
+		if (ret <= 0)
+			return 0;
+
+		NV_DEBUG(dev, "pclk %d out %d N %d fN 0x%04x M %d P %d\n",
+			 pclk, ret, N1, N2, M1, P);
+
+		nv_mask(dev, reg + 0x0c, 0x00000000, 0x00000100);
+		nv_wr32(dev, reg + 0x04, (P << 16) | (N1 << 8) | M1);
+		nv_wr32(dev, reg + 0x10, N2 << 16);
 	}
 
 	return 0;
@@ -321,7 +338,9 @@ nv50_crtc_destroy(struct drm_crtc *crtc)
 
 	nv50_cursor_fini(nv_crtc);
 
+	nouveau_bo_unmap(nv_crtc->lut.nvbo);
 	nouveau_bo_ref(NULL, &nv_crtc->lut.nvbo);
+	nouveau_bo_unmap(nv_crtc->cursor.nvbo);
 	nouveau_bo_ref(NULL, &nv_crtc->cursor.nvbo);
 	kfree(nv_crtc->mode);
 	kfree(nv_crtc);
@@ -440,47 +459,15 @@ nv50_crtc_prepare(struct drm_crtc *crtc)
 {
 	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
-	struct drm_encoder *encoder;
-	uint32_t dac = 0, sor = 0;
 
 	NV_DEBUG_KMS(dev, "index %d\n", nv_crtc->index);
 
-	/* Disconnect all unused encoders. */
-	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
-		struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-
-		if (!drm_helper_encoder_in_use(encoder))
-			continue;
-
-		if (nv_encoder->dcb->type == OUTPUT_ANALOG ||
-		    nv_encoder->dcb->type == OUTPUT_TV)
-			dac |= (1 << nv_encoder->or);
-		else
-			sor |= (1 << nv_encoder->or);
-	}
-
-	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
-		struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-
-		if (nv_encoder->dcb->type == OUTPUT_ANALOG ||
-		    nv_encoder->dcb->type == OUTPUT_TV) {
-			if (dac & (1 << nv_encoder->or))
-				continue;
-		} else {
-			if (sor & (1 << nv_encoder->or))
-				continue;
-		}
-
-		nv_encoder->disconnect(nv_encoder);
-	}
-
 	nv50_crtc_blank(nv_crtc, true);
 }
 
 static void
 nv50_crtc_commit(struct drm_crtc *crtc)
 {
-	struct drm_crtc *crtc2;
 	struct drm_device *dev = crtc->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_channel *evo = dev_priv->evo;
@@ -491,20 +478,14 @@ nv50_crtc_commit(struct drm_crtc *crtc)
 
 	nv50_crtc_blank(nv_crtc, false);
 
-	/* Explicitly blank all unused crtc's. */
-	list_for_each_entry(crtc2, &dev->mode_config.crtc_list, head) {
-		if (!drm_helper_crtc_in_use(crtc2))
-			nv50_crtc_blank(nouveau_crtc(crtc2), true);
-	}
-
 	ret = RING_SPACE(evo, 2);
 	if (ret) {
 		NV_ERROR(dev, "no space while committing crtc\n");
 		return;
 	}
 	BEGIN_RING(evo, 0, NV50_EVO_UPDATE, 1);
-	OUT_RING(evo, 0);
-	FIRE_RING(evo);
+	OUT_RING  (evo, 0);
+	FIRE_RING (evo);
 }
 
 static bool
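
Reviewer note (not part of the patch): the new NV_C0 branch in nv50_crtc_set_clock() packs the PLL coefficients differently from the existing single-stage branch. The older path writes reg1 | (P << 16) | (M1 << 8) | N1 to reg + 4, while the NV_C0 path writes (P << 16) | (N1 << 8) | M1 to reg + 0x04, so N1 and M1 trade places. A minimal sketch of the two layouts follows. The helper names are hypothetical, and nothing here touches hardware.

```c
#include <stdint.h>

/* Layout used by the pre-NV_C0 single-stage branch in the hunk above. */
static uint32_t pll_coef_pre_c0(uint32_t reg1, int N1, int M1, int P)
{
	return reg1 | ((uint32_t)P << 16) | ((uint32_t)M1 << 8) | (uint32_t)N1;
}

/* Layout used by the NV_C0 branch: N1 and M1 are swapped. */
static uint32_t pll_coef_c0(int N1, int M1, int P)
{
	return ((uint32_t)P << 16) | ((uint32_t)N1 << 8) | (uint32_t)M1;
}
```

Keeping the two layouts in separate helpers like this would make the field swap easy to spot in review. Whether that is worth doing in the driver itself is a judgment call for the maintainers.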
diff --git a/drivers/gpu/drm/nouveau/nv50_cursor.c b/drivers/gpu/drm/nouveau/nv50_cursor.c
index 03ad7ab..1b9ce30 100644
--- a/drivers/gpu/drm/nouveau/nv50_cursor.c
+++ b/drivers/gpu/drm/nouveau/nv50_cursor.c
@@ -147,7 +147,7 @@ nv50_cursor_fini(struct nouveau_crtc *nv_crtc)
 	NV_DEBUG_KMS(dev, "\n");
 
 	nv_wr32(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(idx), 0);
-	if (!nv_wait(NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(idx),
+	if (!nv_wait(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(idx),
 		     NV50_PDISPLAY_CURSOR_CURSOR_CTRL2_STATUS, 0)) {
 		NV_ERROR(dev, "timeout: CURSOR_CTRL2_STATUS == 0\n");
 		NV_ERROR(dev, "CURSOR_CTRL2 = 0x%08x\n",
diff --git a/drivers/gpu/drm/nouveau/nv50_dac.c b/drivers/gpu/drm/nouveau/nv50_dac.c
index 1fd9537..875414b 100644
--- a/drivers/gpu/drm/nouveau/nv50_dac.c
+++ b/drivers/gpu/drm/nouveau/nv50_dac.c
@@ -37,22 +37,31 @@
 #include "nv50_display.h"
 
 static void
-nv50_dac_disconnect(struct nouveau_encoder *nv_encoder)
+nv50_dac_disconnect(struct drm_encoder *encoder)
 {
-	struct drm_device *dev = to_drm_encoder(nv_encoder)->dev;
+	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
+	struct drm_device *dev = encoder->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_channel *evo = dev_priv->evo;
 	int ret;
 
+	if (!nv_encoder->crtc)
+		return;
+	nv50_crtc_blank(nouveau_crtc(nv_encoder->crtc), true);
+
 	NV_DEBUG_KMS(dev, "Disconnecting DAC %d\n", nv_encoder->or);
 
-	ret = RING_SPACE(evo, 2);
+	ret = RING_SPACE(evo, 4);
 	if (ret) {
 		NV_ERROR(dev, "no space while disconnecting DAC\n");
 		return;
 	}
 	BEGIN_RING(evo, 0, NV50_EVO_DAC(nv_encoder->or, MODE_CTRL), 1);
-	OUT_RING(evo, 0);
+	OUT_RING  (evo, 0);
+	BEGIN_RING(evo, 0, NV50_EVO_UPDATE, 1);
+	OUT_RING  (evo, 0);
+
+	nv_encoder->crtc = NULL;
 }
 
 static enum drm_connector_status
@@ -70,7 +79,7 @@ nv50_dac_detect(struct drm_encoder *encoder, struct drm_connector *connector)
 
 	nv_wr32(dev, NV50_PDISPLAY_DAC_DPMS_CTRL(or),
 		0x00150000 | NV50_PDISPLAY_DAC_DPMS_CTRL_PENDING);
-	if (!nv_wait(NV50_PDISPLAY_DAC_DPMS_CTRL(or),
+	if (!nv_wait(dev, NV50_PDISPLAY_DAC_DPMS_CTRL(or),
 		     NV50_PDISPLAY_DAC_DPMS_CTRL_PENDING, 0)) {
 		NV_ERROR(dev, "timeout: DAC_DPMS_CTRL_PENDING(%d) == 0\n", or);
 		NV_ERROR(dev, "DAC_DPMS_CTRL(%d) = 0x%08x\n", or,
@@ -121,7 +130,7 @@ nv50_dac_dpms(struct drm_encoder *encoder, int mode)
 	NV_DEBUG_KMS(dev, "or %d mode %d\n", or, mode);
 
 	/* wait for it to be done */
-	if (!nv_wait(NV50_PDISPLAY_DAC_DPMS_CTRL(or),
+	if (!nv_wait(dev, NV50_PDISPLAY_DAC_DPMS_CTRL(or),
 		     NV50_PDISPLAY_DAC_DPMS_CTRL_PENDING, 0)) {
 		NV_ERROR(dev, "timeout: DAC_DPMS_CTRL_PENDING(%d) == 0\n", or);
 		NV_ERROR(dev, "DAC_DPMS_CTRL(%d) = 0x%08x\n", or,
@@ -213,7 +222,8 @@ nv50_dac_mode_set(struct drm_encoder *encoder, struct drm_display_mode *mode,
 	uint32_t mode_ctl = 0, mode_ctl2 = 0;
 	int ret;
 
-	NV_DEBUG_KMS(dev, "or %d\n", nv_encoder->or);
+	NV_DEBUG_KMS(dev, "or %d type %d crtc %d\n",
+		     nv_encoder->or, nv_encoder->dcb->type, crtc->index);
 
 	nv50_dac_dpms(encoder, DRM_MODE_DPMS_ON);
 
@@ -243,6 +253,14 @@ nv50_dac_mode_set(struct drm_encoder *encoder, struct drm_display_mode *mode,
 	BEGIN_RING(evo, 0, NV50_EVO_DAC(nv_encoder->or, MODE_CTRL), 2);
 	OUT_RING(evo, mode_ctl);
 	OUT_RING(evo, mode_ctl2);
+
+	nv_encoder->crtc = encoder->crtc;
+}
+
+static struct drm_crtc *
+nv50_dac_crtc_get(struct drm_encoder *encoder)
+{
+	return nouveau_encoder(encoder)->crtc;
 }
 
 static const struct drm_encoder_helper_funcs nv50_dac_helper_funcs = {
@@ -253,7 +271,9 @@ static const struct drm_encoder_helper_funcs nv50_dac_helper_funcs = {
 	.prepare = nv50_dac_prepare,
 	.commit = nv50_dac_commit,
 	.mode_set = nv50_dac_mode_set,
-	.detect = nv50_dac_detect
+	.get_crtc = nv50_dac_crtc_get,
+	.detect = nv50_dac_detect,
+	.disable = nv50_dac_disconnect
 };
 
 static void
@@ -275,14 +295,11 @@ static const struct drm_encoder_funcs nv50_dac_encoder_funcs = {
 };
 
 int
-nv50_dac_create(struct drm_device *dev, struct dcb_entry *entry)
+nv50_dac_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
 	struct nouveau_encoder *nv_encoder;
 	struct drm_encoder *encoder;
 
-	NV_DEBUG_KMS(dev, "\n");
-	NV_INFO(dev, "Detected a DAC output\n");
-
 	nv_encoder = kzalloc(sizeof(*nv_encoder), GFP_KERNEL);
 	if (!nv_encoder)
 		return -ENOMEM;
@@ -291,14 +308,14 @@ nv50_dac_create(struct drm_device *dev, struct dcb_entry *entry)
 	nv_encoder->dcb = entry;
 	nv_encoder->or = ffs(entry->or) - 1;
 
-	nv_encoder->disconnect = nv50_dac_disconnect;
-
-	drm_encoder_init(dev, encoder, &nv50_dac_encoder_funcs,
+	drm_encoder_init(connector->dev, encoder, &nv50_dac_encoder_funcs,
 			 DRM_MODE_ENCODER_DAC);
 	drm_encoder_helper_add(encoder, &nv50_dac_helper_funcs);
 
 	encoder->possible_crtcs = entry->heads;
 	encoder->possible_clones = 0;
+
+	drm_mode_connector_attach_encoder(connector, encoder);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv50_display.c b/drivers/gpu/drm/nouveau/nv50_display.c
index 580a5d1..11d366a 100644
--- a/drivers/gpu/drm/nouveau/nv50_display.c
+++ b/drivers/gpu/drm/nouveau/nv50_display.c
@@ -30,8 +30,22 @@
 #include "nouveau_connector.h"
 #include "nouveau_fb.h"
 #include "nouveau_fbcon.h"
+#include "nouveau_ramht.h"
 #include "drm_crtc_helper.h"
 
+static inline int
+nv50_sor_nr(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->chipset  < 0x90 ||
+	    dev_priv->chipset == 0x92 ||
+	    dev_priv->chipset == 0xa0)
+		return 2;
+
+	return 4;
+}
+
 static void
 nv50_evo_channel_del(struct nouveau_channel **pchan)
 {
@@ -42,6 +56,7 @@ nv50_evo_channel_del(struct nouveau_channel **pchan)
 	*pchan = NULL;
 
 	nouveau_gpuobj_channel_takedown(chan);
+	nouveau_bo_unmap(chan->pushbuf_bo);
 	nouveau_bo_ref(NULL, &chan->pushbuf_bo);
 
 	if (chan->user)
@@ -65,21 +80,23 @@ nv50_evo_dmaobj_new(struct nouveau_channel *evo, uint32_t class, uint32_t name,
 		return ret;
 	obj->engine = NVOBJ_ENGINE_DISPLAY;
 
-	ret = nouveau_gpuobj_ref_add(dev, evo, name, obj, NULL);
+	nv_wo32(obj,  0, (tile_flags << 22) | (magic_flags << 16) | class);
+	nv_wo32(obj,  4, limit);
+	nv_wo32(obj,  8, offset);
+	nv_wo32(obj, 12, 0x00000000);
+	nv_wo32(obj, 16, 0x00000000);
+	if (dev_priv->card_type < NV_C0)
+		nv_wo32(obj, 20, 0x00010000);
+	else
+		nv_wo32(obj, 20, 0x00020000);
+	dev_priv->engine.instmem.flush(dev);
+
+	ret = nouveau_ramht_insert(evo, name, obj);
+	nouveau_gpuobj_ref(NULL, &obj);
 	if (ret) {
-		nouveau_gpuobj_del(dev, &obj);
 		return ret;
 	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	nv_wo32(dev, obj, 0, (tile_flags << 22) | (magic_flags << 16) | class);
-	nv_wo32(dev, obj, 1, limit);
-	nv_wo32(dev, obj, 2, offset);
-	nv_wo32(dev, obj, 3, 0x00000000);
-	nv_wo32(dev, obj, 4, 0x00000000);
-	nv_wo32(dev, obj, 5, 0x00010000);
-	dev_priv->engine.instmem.finish_access(dev);
-
 	return 0;
 }
 
@@ -87,6 +104,7 @@ static int
 nv50_evo_channel_new(struct drm_device *dev, struct nouveau_channel **pchan)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_gpuobj *ramht = NULL;
 	struct nouveau_channel *chan;
 	int ret;
 
@@ -100,32 +118,35 @@ nv50_evo_channel_new(struct drm_device *dev, struct nouveau_channel **pchan)
 	chan->user_get = 4;
 	chan->user_put = 0;
 
-	INIT_LIST_HEAD(&chan->ramht_refs);
-
-	ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0, 32768, 0x1000,
-				     NVOBJ_FLAG_ZERO_ALLOC, &chan->ramin);
+	ret = nouveau_gpuobj_new(dev, NULL, 32768, 0x1000,
+				 NVOBJ_FLAG_ZERO_ALLOC, &chan->ramin);
 	if (ret) {
 		NV_ERROR(dev, "Error allocating EVO channel memory: %d\n", ret);
 		nv50_evo_channel_del(pchan);
 		return ret;
 	}
 
-	ret = nouveau_mem_init_heap(&chan->ramin_heap, chan->ramin->gpuobj->
-				    im_pramin->start, 32768);
+	ret = drm_mm_init(&chan->ramin_heap, 0, 32768);
 	if (ret) {
 		NV_ERROR(dev, "Error initialising EVO PRAMIN heap: %d\n", ret);
 		nv50_evo_channel_del(pchan);
 		return ret;
 	}
 
-	ret = nouveau_gpuobj_new_ref(dev, chan, chan, 0, 4096, 16,
-				     0, &chan->ramht);
+	ret = nouveau_gpuobj_new(dev, chan, 4096, 16, 0, &ramht);
 	if (ret) {
 		NV_ERROR(dev, "Unable to allocate EVO RAMHT: %d\n", ret);
 		nv50_evo_channel_del(pchan);
 		return ret;
 	}
 
+	ret = nouveau_ramht_new(dev, ramht, &chan->ramht);
+	nouveau_gpuobj_ref(NULL, &ramht);
+	if (ret) {
+		nv50_evo_channel_del(pchan);
+		return ret;
+	}
+
 	if (dev_priv->chipset != 0x50) {
 		ret = nv50_evo_dmaobj_new(chan, 0x3d, NvEvoFB16, 0x70, 0x19,
 					  0, 0xffffffff);
@@ -179,13 +200,25 @@ nv50_evo_channel_new(struct drm_device *dev, struct nouveau_channel **pchan)
 }
 
 int
+nv50_display_early_init(struct drm_device *dev)
+{
+	return 0;
+}
+
+void
+nv50_display_late_takedown(struct drm_device *dev)
+{
+}
+
+int
 nv50_display_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_timer_engine *ptimer = &dev_priv->engine.timer;
+	struct nouveau_gpio_engine *pgpio = &dev_priv->engine.gpio;
 	struct nouveau_channel *evo = dev_priv->evo;
 	struct drm_connector *connector;
-	uint32_t val, ram_amount, hpd_en[2];
+	uint32_t val, ram_amount;
 	uint64_t start;
 	int ret, i;
 
@@ -213,11 +246,11 @@ nv50_display_init(struct drm_device *dev)
 		nv_wr32(dev, 0x006101d0 + (i * 0x04), val);
 	}
 	/* SOR */
-	for (i = 0; i < 4; i++) {
+	for (i = 0; i < nv50_sor_nr(dev); i++) {
 		val = nv_rd32(dev, 0x0061c000 + (i * 0x800));
 		nv_wr32(dev, 0x006101e0 + (i * 0x04), val);
 	}
-	/* Something not yet in use, tv-out maybe. */
+	/* EXT */
 	for (i = 0; i < 3; i++) {
 		val = nv_rd32(dev, 0x0061e000 + (i * 0x800));
 		nv_wr32(dev, 0x006101f0 + (i * 0x04), val);
@@ -246,7 +279,7 @@ nv50_display_init(struct drm_device *dev)
 	if (nv_rd32(dev, NV50_PDISPLAY_INTR_1) & 0x100) {
 		nv_wr32(dev, NV50_PDISPLAY_INTR_1, 0x100);
 		nv_wr32(dev, 0x006194e8, nv_rd32(dev, 0x006194e8) & ~1);
-		if (!nv_wait(0x006194e8, 2, 0)) {
+		if (!nv_wait(dev, 0x006194e8, 2, 0)) {
 			NV_ERROR(dev, "timeout: (0x6194e8 & 2) != 0\n");
 			NV_ERROR(dev, "0x6194e8 = 0x%08x\n",
 						nv_rd32(dev, 0x6194e8));
@@ -277,7 +310,8 @@ nv50_display_init(struct drm_device *dev)
 
 	nv_wr32(dev, NV50_PDISPLAY_CTRL_STATE, NV50_PDISPLAY_CTRL_STATE_ENABLE);
 	nv_wr32(dev, NV50_PDISPLAY_CHANNEL_STAT(0), 0x1000b03);
-	if (!nv_wait(NV50_PDISPLAY_CHANNEL_STAT(0), 0x40000000, 0x40000000)) {
+	if (!nv_wait(dev, NV50_PDISPLAY_CHANNEL_STAT(0),
+		     0x40000000, 0x40000000)) {
 		NV_ERROR(dev, "timeout: (0x610200 & 0x40000000) == 0x40000000\n");
 		NV_ERROR(dev, "0x610200 = 0x%08x\n",
 			  nv_rd32(dev, NV50_PDISPLAY_CHANNEL_STAT(0)));
@@ -286,7 +320,7 @@ nv50_display_init(struct drm_device *dev)
 
 	for (i = 0; i < 2; i++) {
 		nv_wr32(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i), 0x2000);
-		if (!nv_wait(NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i),
+		if (!nv_wait(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i),
 			     NV50_PDISPLAY_CURSOR_CURSOR_CTRL2_STATUS, 0)) {
 			NV_ERROR(dev, "timeout: CURSOR_CTRL2_STATUS == 0\n");
 			NV_ERROR(dev, "CURSOR_CTRL2 = 0x%08x\n",
@@ -296,7 +330,7 @@ nv50_display_init(struct drm_device *dev)
 
 		nv_wr32(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i),
 			NV50_PDISPLAY_CURSOR_CURSOR_CTRL2_ON);
-		if (!nv_wait(NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i),
+		if (!nv_wait(dev, NV50_PDISPLAY_CURSOR_CURSOR_CTRL2(i),
 			     NV50_PDISPLAY_CURSOR_CURSOR_CTRL2_STATUS,
 			     NV50_PDISPLAY_CURSOR_CURSOR_CTRL2_STATUS_ACTIVE)) {
 			NV_ERROR(dev, "timeout: "
@@ -307,7 +341,7 @@ nv50_display_init(struct drm_device *dev)
 		}
 	}
 
-	nv_wr32(dev, NV50_PDISPLAY_OBJECTS, (evo->ramin->instance >> 8) | 9);
+	nv_wr32(dev, NV50_PDISPLAY_OBJECTS, (evo->ramin->vinst >> 8) | 9);
 
 	/* initialise fifo */
 	nv_wr32(dev, NV50_PDISPLAY_CHANNEL_DMA_CB(0),
@@ -316,7 +350,7 @@ nv50_display_init(struct drm_device *dev)
 		NV50_PDISPLAY_CHANNEL_DMA_CB_VALID);
 	nv_wr32(dev, NV50_PDISPLAY_CHANNEL_UNK2(0), 0x00010000);
 	nv_wr32(dev, NV50_PDISPLAY_CHANNEL_UNK3(0), 0x00000002);
-	if (!nv_wait(0x610200, 0x80000000, 0x00000000)) {
+	if (!nv_wait(dev, 0x610200, 0x80000000, 0x00000000)) {
 		NV_ERROR(dev, "timeout: (0x610200 & 0x80000000) == 0\n");
 		NV_ERROR(dev, "0x610200 = 0x%08x\n", nv_rd32(dev, 0x610200));
 		return -EBUSY;
@@ -356,7 +390,7 @@ nv50_display_init(struct drm_device *dev)
 	BEGIN_RING(evo, 0, NV50_EVO_CRTC(0, UNK082C), 1);
 	OUT_RING(evo, 0);
 	FIRE_RING(evo);
-	if (!nv_wait(0x640004, 0xffffffff, evo->dma.put << 2))
+	if (!nv_wait(dev, 0x640004, 0xffffffff, evo->dma.put << 2))
 		NV_ERROR(dev, "evo pushbuf stalled\n");
 
 	/* enable clock change interrupts. */
@@ -366,26 +400,13 @@ nv50_display_init(struct drm_device *dev)
 					     NV50_PDISPLAY_INTR_EN_CLK_UNK40));
 
 	/* enable hotplug interrupts */
-	hpd_en[0] = hpd_en[1] = 0;
 	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
 		struct nouveau_connector *conn = nouveau_connector(connector);
-		struct dcb_gpio_entry *gpio;
 
 		if (conn->dcb->gpio_tag == 0xff)
 			continue;
 
-		gpio = nouveau_bios_gpio_entry(dev, conn->dcb->gpio_tag);
-		if (!gpio)
-			continue;
-
-		hpd_en[gpio->line >> 4] |= (0x00010001 << (gpio->line & 0xf));
-	}
-
-	nv_wr32(dev, 0xe054, 0xffffffff);
-	nv_wr32(dev, 0xe050, hpd_en[0]);
-	if (dev_priv->chipset >= 0x90) {
-		nv_wr32(dev, 0xe074, 0xffffffff);
-		nv_wr32(dev, 0xe070, hpd_en[1]);
+		pgpio->irq_enable(dev, conn->dcb->gpio_tag, true);
 	}
 
 	return 0;
@@ -423,7 +444,7 @@ static int nv50_display_disable(struct drm_device *dev)
 			continue;
 
 		nv_wr32(dev, NV50_PDISPLAY_INTR_1, mask);
-		if (!nv_wait(NV50_PDISPLAY_INTR_1, mask, mask)) {
+		if (!nv_wait(dev, NV50_PDISPLAY_INTR_1, mask, mask)) {
 			NV_ERROR(dev, "timeout: (0x610024 & 0x%08x) == "
 				      "0x%08x\n", mask, mask);
 			NV_ERROR(dev, "0x610024 = 0x%08x\n",
@@ -433,14 +454,14 @@ static int nv50_display_disable(struct drm_device *dev)
 
 	nv_wr32(dev, NV50_PDISPLAY_CHANNEL_STAT(0), 0);
 	nv_wr32(dev, NV50_PDISPLAY_CTRL_STATE, 0);
-	if (!nv_wait(NV50_PDISPLAY_CHANNEL_STAT(0), 0x1e0000, 0)) {
+	if (!nv_wait(dev, NV50_PDISPLAY_CHANNEL_STAT(0), 0x1e0000, 0)) {
 		NV_ERROR(dev, "timeout: (0x610200 & 0x1e0000) == 0\n");
 		NV_ERROR(dev, "0x610200 = 0x%08x\n",
 			  nv_rd32(dev, NV50_PDISPLAY_CHANNEL_STAT(0)));
 	}
 
 	for (i = 0; i < 3; i++) {
-		if (!nv_wait(NV50_PDISPLAY_SOR_DPMS_STATE(i),
+		if (!nv_wait(dev, NV50_PDISPLAY_SOR_DPMS_STATE(i),
 			     NV50_PDISPLAY_SOR_DPMS_STATE_WAIT, 0)) {
 			NV_ERROR(dev, "timeout: SOR_DPMS_STATE_WAIT(%d) == 0\n", i);
 			NV_ERROR(dev, "SOR_DPMS_STATE(%d) = 0x%08x\n", i,
@@ -465,6 +486,7 @@ int nv50_display_create(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct dcb_table *dcb = &dev_priv->vbios.dcb;
+	struct drm_connector *connector, *ct;
 	int ret, i;
 
 	NV_DEBUG_KMS(dev, "\n");
@@ -507,14 +529,18 @@ int nv50_display_create(struct drm_device *dev)
 			continue;
 		}
 
+		connector = nouveau_connector_create(dev, entry->connector);
+		if (IS_ERR(connector))
+			continue;
+
 		switch (entry->type) {
 		case OUTPUT_TMDS:
 		case OUTPUT_LVDS:
 		case OUTPUT_DP:
-			nv50_sor_create(dev, entry);
+			nv50_sor_create(connector, entry);
 			break;
 		case OUTPUT_ANALOG:
-			nv50_dac_create(dev, entry);
+			nv50_dac_create(connector, entry);
 			break;
 		default:
 			NV_WARN(dev, "DCB encoder %d unknown\n", entry->type);
@@ -522,11 +548,13 @@ int nv50_display_create(struct drm_device *dev)
 		}
 	}
 
-	for (i = 0 ; i < dcb->connector.entries; i++) {
-		if (i != 0 && dcb->connector.entry[i].index2 ==
-			      dcb->connector.entry[i - 1].index2)
-			continue;
-		nouveau_connector_create(dev, &dcb->connector.entry[i]);
+	list_for_each_entry_safe(connector, ct,
+				 &dev->mode_config.connector_list, head) {
+		if (!connector->encoder_ids[0]) {
+			NV_WARN(dev, "%s has no encoders, removing\n",
+				drm_get_connector_name(connector));
+			connector->funcs->destroy(connector);
+		}
 	}
 
 	ret = nv50_display_init(dev);
@@ -538,7 +566,8 @@ int nv50_display_create(struct drm_device *dev)
 	return 0;
 }
 
-int nv50_display_destroy(struct drm_device *dev)
+void
+nv50_display_destroy(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 
@@ -548,135 +577,30 @@ int nv50_display_destroy(struct drm_device *dev)
 
 	nv50_display_disable(dev);
 	nv50_evo_channel_del(&dev_priv->evo);
-
-	return 0;
-}
-
-static inline uint32_t
-nv50_display_mode_ctrl(struct drm_device *dev, bool sor, int or)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t mc;
-
-	if (sor) {
-		if (dev_priv->chipset < 0x90 ||
-		    dev_priv->chipset == 0x92 || dev_priv->chipset == 0xa0)
-			mc = nv_rd32(dev, NV50_PDISPLAY_SOR_MODE_CTRL_P(or));
-		else
-			mc = nv_rd32(dev, NV90_PDISPLAY_SOR_MODE_CTRL_P(or));
-	} else {
-		mc = nv_rd32(dev, NV50_PDISPLAY_DAC_MODE_CTRL_P(or));
-	}
-
-	return mc;
-}
-
-static int
-nv50_display_irq_head(struct drm_device *dev, int *phead,
-		      struct dcb_entry **pdcbent)
-{
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	uint32_t unk30 = nv_rd32(dev, NV50_PDISPLAY_UNK30_CTRL);
-	uint32_t dac = 0, sor = 0;
-	int head, i, or = 0, type = OUTPUT_ANY;
-
-	/* We're assuming that head 0 *or* head 1 will be active here,
-	 * and not both.  I'm not sure if the hw will even signal both
-	 * ever, but it definitely shouldn't for us as we commit each
-	 * CRTC separately, and submission will be blocked by the GPU
-	 * until we handle each in turn.
-	 */
-	NV_DEBUG_KMS(dev, "0x610030: 0x%08x\n", unk30);
-	head = ffs((unk30 >> 9) & 3) - 1;
-	if (head < 0)
-		return -EINVAL;
-
-	/* This assumes CRTCs are never bound to multiple encoders, which
-	 * should be the case.
-	 */
-	for (i = 0; i < 3 && type == OUTPUT_ANY; i++) {
-		uint32_t mc = nv50_display_mode_ctrl(dev, false, i);
-		if (!(mc & (1 << head)))
-			continue;
-
-		switch ((mc >> 8) & 0xf) {
-		case 0: type = OUTPUT_ANALOG; break;
-		case 1: type = OUTPUT_TV; break;
-		default:
-			NV_ERROR(dev, "unknown dac mode_ctrl: 0x%08x\n", dac);
-			return -1;
-		}
-
-		or = i;
-	}
-
-	for (i = 0; i < 4 && type == OUTPUT_ANY; i++) {
-		uint32_t mc = nv50_display_mode_ctrl(dev, true, i);
-		if (!(mc & (1 << head)))
-			continue;
-
-		switch ((mc >> 8) & 0xf) {
-		case 0: type = OUTPUT_LVDS; break;
-		case 1: type = OUTPUT_TMDS; break;
-		case 2: type = OUTPUT_TMDS; break;
-		case 5: type = OUTPUT_TMDS; break;
-		case 8: type = OUTPUT_DP; break;
-		case 9: type = OUTPUT_DP; break;
-		default:
-			NV_ERROR(dev, "unknown sor mode_ctrl: 0x%08x\n", sor);
-			return -1;
-		}
-
-		or = i;
-	}
-
-	NV_DEBUG_KMS(dev, "type %d, or %d\n", type, or);
-	if (type == OUTPUT_ANY) {
-		NV_ERROR(dev, "unknown encoder!!\n");
-		return -1;
-	}
-
-	for (i = 0; i < dev_priv->vbios.dcb.entries; i++) {
-		struct dcb_entry *dcbent = &dev_priv->vbios.dcb.entry[i];
-
-		if (dcbent->type != type)
-			continue;
-
-		if (!(dcbent->or & (1 << or)))
-			continue;
-
-		*phead = head;
-		*pdcbent = dcbent;
-		return 0;
-	}
-
-	NV_ERROR(dev, "no DCB entry for %d %d\n", dac != 0, or);
-	return 0;
 }
 
-static uint32_t
-nv50_display_script_select(struct drm_device *dev, struct dcb_entry *dcbent,
-			   int pxclk)
+static u16
+nv50_display_script_select(struct drm_device *dev, struct dcb_entry *dcb,
+			   u32 mc, int pxclk)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_connector *nv_connector = NULL;
 	struct drm_encoder *encoder;
 	struct nvbios *bios = &dev_priv->vbios;
-	uint32_t mc, script = 0, or;
+	u32 script = 0, or;
 
 	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
 		struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
 
-		if (nv_encoder->dcb != dcbent)
+		if (nv_encoder->dcb != dcb)
 			continue;
 
 		nv_connector = nouveau_encoder_connector_get(nv_encoder);
 		break;
 	}
 
-	or = ffs(dcbent->or) - 1;
-	mc = nv50_display_mode_ctrl(dev, dcbent->type != OUTPUT_ANALOG, or);
-	switch (dcbent->type) {
+	or = ffs(dcb->or) - 1;
+	switch (dcb->type) {
 	case OUTPUT_LVDS:
 		script = (mc >> 8) & 0xf;
 		if (bios->fp_no_ddc) {
@@ -767,17 +691,88 @@ nv50_display_vblank_handler(struct drm_device *dev, uint32_t intr)
 static void
 nv50_display_unk10_handler(struct drm_device *dev)
 {
-	struct dcb_entry *dcbent;
-	int head, ret;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 unk30 = nv_rd32(dev, 0x610030), mc;
+	int i, crtc, or, type = OUTPUT_ANY;
 
-	ret = nv50_display_irq_head(dev, &head, &dcbent);
-	if (ret)
-		goto ack;
+	NV_DEBUG_KMS(dev, "0x610030: 0x%08x\n", unk30);
+	dev_priv->evo_irq.dcb = NULL;
 
 	nv_wr32(dev, 0x619494, nv_rd32(dev, 0x619494) & ~8);
 
-	nouveau_bios_run_display_table(dev, dcbent, 0, -1);
+	/* Determine which CRTC we're dealing with; only one will ever be
+	 * signalled at a time with the current nouveau code.
+	 */
+	crtc = ffs((unk30 & 0x00000060) >> 5) - 1;
+	if (crtc < 0)
+		goto ack;
+
+	/* Nothing needs to be done for the encoder */
+	crtc = ffs((unk30 & 0x00000180) >> 7) - 1;
+	if (crtc < 0)
+		goto ack;
+
+	/* Find which encoder was connected to the CRTC */
+	for (i = 0; type == OUTPUT_ANY && i < 3; i++) {
+		mc = nv_rd32(dev, NV50_PDISPLAY_DAC_MODE_CTRL_C(i));
+		NV_DEBUG_KMS(dev, "DAC-%d mc: 0x%08x\n", i, mc);
+		if (!(mc & (1 << crtc)))
+			continue;
+
+		switch ((mc & 0x00000f00) >> 8) {
+		case 0: type = OUTPUT_ANALOG; break;
+		case 1: type = OUTPUT_TV; break;
+		default:
+			NV_ERROR(dev, "invalid mc, DAC-%d: 0x%08x\n", i, mc);
+			goto ack;
+		}
+
+		or = i;
+	}
+
+	for (i = 0; type == OUTPUT_ANY && i < nv50_sor_nr(dev); i++) {
+		if (dev_priv->chipset  < 0x90 ||
+		    dev_priv->chipset == 0x92 ||
+		    dev_priv->chipset == 0xa0)
+			mc = nv_rd32(dev, NV50_PDISPLAY_SOR_MODE_CTRL_C(i));
+		else
+			mc = nv_rd32(dev, NV90_PDISPLAY_SOR_MODE_CTRL_C(i));
+
+		NV_DEBUG_KMS(dev, "SOR-%d mc: 0x%08x\n", i, mc);
+		if (!(mc & (1 << crtc)))
+			continue;
+
+		switch ((mc & 0x00000f00) >> 8) {
+		case 0: type = OUTPUT_LVDS; break;
+		case 1: type = OUTPUT_TMDS; break;
+		case 2: type = OUTPUT_TMDS; break;
+		case 5: type = OUTPUT_TMDS; break;
+		case 8: type = OUTPUT_DP; break;
+		case 9: type = OUTPUT_DP; break;
+		default:
+			NV_ERROR(dev, "invalid mc, SOR-%d: 0x%08x\n", i, mc);
+			goto ack;
+		}
+
+		or = i;
+	}
+
+	/* There was no encoder to disable */
+	if (type == OUTPUT_ANY)
+		goto ack;
+
+	/* Disable the encoder */
+	for (i = 0; i < dev_priv->vbios.dcb.entries; i++) {
+		struct dcb_entry *dcb = &dev_priv->vbios.dcb.entry[i];
+
+		if (dcb->type == type && (dcb->or & (1 << or))) {
+			nouveau_bios_run_display_table(dev, dcb, 0, -1);
+			dev_priv->evo_irq.dcb = dcb;
+			goto ack;
+		}
+	}
 
+	NV_ERROR(dev, "no dcb for %d %d 0x%08x\n", or, type, mc);
 ack:
 	nv_wr32(dev, NV50_PDISPLAY_INTR_1, NV50_PDISPLAY_INTR_1_CLK_UNK10);
 	nv_wr32(dev, 0x610030, 0x80000000);
@@ -817,33 +812,103 @@ nv50_display_unk20_dp_hack(struct drm_device *dev, struct dcb_entry *dcb)
 static void
 nv50_display_unk20_handler(struct drm_device *dev)
 {
-	struct dcb_entry *dcbent;
-	uint32_t tmp, pclk, script;
-	int head, or, ret;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 unk30 = nv_rd32(dev, 0x610030), tmp, pclk, script, mc;
+	struct dcb_entry *dcb;
+	int i, crtc, or, type = OUTPUT_ANY;
 
-	ret = nv50_display_irq_head(dev, &head, &dcbent);
-	if (ret)
+	NV_DEBUG_KMS(dev, "0x610030: 0x%08x\n", unk30);
+	dcb = dev_priv->evo_irq.dcb;
+	if (dcb) {
+		nouveau_bios_run_display_table(dev, dcb, 0, -2);
+		dev_priv->evo_irq.dcb = NULL;
+	}
+
+	/* CRTC clock change requested? */
+	crtc = ffs((unk30 & 0x00000600) >> 9) - 1;
+	if (crtc >= 0) {
+		pclk  = nv_rd32(dev, NV50_PDISPLAY_CRTC_P(crtc, CLOCK));
+		pclk &= 0x003fffff;
+
+		nv50_crtc_set_clock(dev, crtc, pclk);
+
+		tmp = nv_rd32(dev, NV50_PDISPLAY_CRTC_CLK_CTRL2(crtc));
+		tmp &= ~0x000000f;
+		nv_wr32(dev, NV50_PDISPLAY_CRTC_CLK_CTRL2(crtc), tmp);
+	}
+
+	/* Nothing needs to be done for the encoder */
+	crtc = ffs((unk30 & 0x00000180) >> 7) - 1;
+	if (crtc < 0)
 		goto ack;
-	or = ffs(dcbent->or) - 1;
-	pclk = nv_rd32(dev, NV50_PDISPLAY_CRTC_P(head, CLOCK)) & 0x3fffff;
-	script = nv50_display_script_select(dev, dcbent, pclk);
+	pclk  = nv_rd32(dev, NV50_PDISPLAY_CRTC_P(crtc, CLOCK)) & 0x003fffff;
+
+	/* Find which encoder is connected to the CRTC */
+	for (i = 0; type == OUTPUT_ANY && i < 3; i++) {
+		mc = nv_rd32(dev, NV50_PDISPLAY_DAC_MODE_CTRL_P(i));
+		NV_DEBUG_KMS(dev, "DAC-%d mc: 0x%08x\n", i, mc);
+		if (!(mc & (1 << crtc)))
+			continue;
+
+		switch ((mc & 0x00000f00) >> 8) {
+		case 0: type = OUTPUT_ANALOG; break;
+		case 1: type = OUTPUT_TV; break;
+		default:
+			NV_ERROR(dev, "invalid mc, DAC-%d: 0x%08x\n", i, mc);
+			goto ack;
+		}
+
+		or = i;
+	}
+
+	for (i = 0; type == OUTPUT_ANY && i < nv50_sor_nr(dev); i++) {
+		if (dev_priv->chipset  < 0x90 ||
+		    dev_priv->chipset == 0x92 ||
+		    dev_priv->chipset == 0xa0)
+			mc = nv_rd32(dev, NV50_PDISPLAY_SOR_MODE_CTRL_P(i));
+		else
+			mc = nv_rd32(dev, NV90_PDISPLAY_SOR_MODE_CTRL_P(i));
+
+		NV_DEBUG_KMS(dev, "SOR-%d mc: 0x%08x\n", i, mc);
+		if (!(mc & (1 << crtc)))
+			continue;
+
+		switch ((mc & 0x00000f00) >> 8) {
+		case 0: type = OUTPUT_LVDS; break;
+		case 1: type = OUTPUT_TMDS; break;
+		case 2: type = OUTPUT_TMDS; break;
+		case 5: type = OUTPUT_TMDS; break;
+		case 8: type = OUTPUT_DP; break;
+		case 9: type = OUTPUT_DP; break;
+		default:
+			NV_ERROR(dev, "invalid mc, SOR-%d: 0x%08x\n", i, mc);
+			goto ack;
+		}
 
-	NV_DEBUG_KMS(dev, "head %d pxclk: %dKHz\n", head, pclk);
+		or = i;
+	}
 
-	if (dcbent->type != OUTPUT_DP)
-		nouveau_bios_run_display_table(dev, dcbent, 0, -2);
+	if (type == OUTPUT_ANY)
+		goto ack;
 
-	nv50_crtc_set_clock(dev, head, pclk);
+	/* Enable the encoder */
+	for (i = 0; i < dev_priv->vbios.dcb.entries; i++) {
+		dcb = &dev_priv->vbios.dcb.entry[i];
+		if (dcb->type == type && (dcb->or & (1 << or)))
+			break;
+	}
 
-	nouveau_bios_run_display_table(dev, dcbent, script, pclk);
+	if (i == dev_priv->vbios.dcb.entries) {
+		NV_ERROR(dev, "no dcb for %d %d 0x%08x\n", or, type, mc);
+		goto ack;
+	}
 
-	nv50_display_unk20_dp_hack(dev, dcbent);
+	script = nv50_display_script_select(dev, dcb, mc, pclk);
+	nouveau_bios_run_display_table(dev, dcb, script, pclk);
 
-	tmp = nv_rd32(dev, NV50_PDISPLAY_CRTC_CLK_CTRL2(head));
-	tmp &= ~0x000000f;
-	nv_wr32(dev, NV50_PDISPLAY_CRTC_CLK_CTRL2(head), tmp);
+	nv50_display_unk20_dp_hack(dev, dcb);
 
-	if (dcbent->type != OUTPUT_ANALOG) {
+	if (dcb->type != OUTPUT_ANALOG) {
 		tmp = nv_rd32(dev, NV50_PDISPLAY_SOR_CLK_CTRL2(or));
 		tmp &= ~0x00000f0f;
 		if (script & 0x0100)
@@ -853,24 +918,61 @@ nv50_display_unk20_handler(struct drm_device *dev)
 		nv_wr32(dev, NV50_PDISPLAY_DAC_CLK_CTRL2(or), 0);
 	}
 
+	dev_priv->evo_irq.dcb = dcb;
+	dev_priv->evo_irq.pclk = pclk;
+	dev_priv->evo_irq.script = script;
+
 ack:
 	nv_wr32(dev, NV50_PDISPLAY_INTR_1, NV50_PDISPLAY_INTR_1_CLK_UNK20);
 	nv_wr32(dev, 0x610030, 0x80000000);
 }
 
+/* If programming a TMDS output on a SOR that can also be configured for
+ * DisplayPort, make sure NV50_SOR_DP_CTRL_ENABLED is forced off.
+ *
+ * It looks like the VBIOS TMDS scripts make an attempt at this, however,
+ * the VBIOS scripts on at least one board I have only switch it off on
+ * link 0, causing a blank display if the output has previously been
+ * programmed for DisplayPort.
+ */
+static void
+nv50_display_unk40_dp_set_tmds(struct drm_device *dev, struct dcb_entry *dcb)
+{
+	int or = ffs(dcb->or) - 1, link = !(dcb->dpconf.sor.link & 1);
+	struct drm_encoder *encoder;
+	u32 tmp;
+
+	if (dcb->type != OUTPUT_TMDS)
+		return;
+
+	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
+		struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
+
+		if (nv_encoder->dcb->type == OUTPUT_DP &&
+		    nv_encoder->dcb->or & (1 << or)) {
+			tmp  = nv_rd32(dev, NV50_SOR_DP_CTRL(or, link));
+			tmp &= ~NV50_SOR_DP_CTRL_ENABLED;
+			nv_wr32(dev, NV50_SOR_DP_CTRL(or, link), tmp);
+			break;
+		}
+	}
+}
+
 static void
 nv50_display_unk40_handler(struct drm_device *dev)
 {
-	struct dcb_entry *dcbent;
-	int head, pclk, script, ret;
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct dcb_entry *dcb = dev_priv->evo_irq.dcb;
+	u16 script = dev_priv->evo_irq.script;
+	u32 unk30 = nv_rd32(dev, 0x610030), pclk = dev_priv->evo_irq.pclk;
 
-	ret = nv50_display_irq_head(dev, &head, &dcbent);
-	if (ret)
+	NV_DEBUG_KMS(dev, "0x610030: 0x%08x\n", unk30);
+	dev_priv->evo_irq.dcb = NULL;
+	if (!dcb)
 		goto ack;
-	pclk = nv_rd32(dev, NV50_PDISPLAY_CRTC_P(head, CLOCK)) & 0x3fffff;
-	script = nv50_display_script_select(dev, dcbent, pclk);
 
-	nouveau_bios_run_display_table(dev, dcbent, script, -pclk);
+	nouveau_bios_run_display_table(dev, dcb, script, -pclk);
+	nv50_display_unk40_dp_set_tmds(dev, dcb);
 
 ack:
 	nv_wr32(dev, NV50_PDISPLAY_INTR_1, NV50_PDISPLAY_INTR_1_CLK_UNK40);
diff --git a/drivers/gpu/drm/nouveau/nv50_display.h b/drivers/gpu/drm/nouveau/nv50_display.h
index 581d405..c551f0b 100644
--- a/drivers/gpu/drm/nouveau/nv50_display.h
+++ b/drivers/gpu/drm/nouveau/nv50_display.h
@@ -38,9 +38,11 @@
 void nv50_display_irq_handler(struct drm_device *dev);
 void nv50_display_irq_handler_bh(struct work_struct *work);
 void nv50_display_irq_hotplug_bh(struct work_struct *work);
-int nv50_display_init(struct drm_device *dev);
+int nv50_display_early_init(struct drm_device *dev);
+void nv50_display_late_takedown(struct drm_device *dev);
 int nv50_display_create(struct drm_device *dev);
-int nv50_display_destroy(struct drm_device *dev);
+int nv50_display_init(struct drm_device *dev);
+void nv50_display_destroy(struct drm_device *dev);
 int nv50_crtc_blank(struct nouveau_crtc *, bool blank);
 int nv50_crtc_set_clock(struct drm_device *, int head, int pclk);
 
diff --git a/drivers/gpu/drm/nouveau/nv50_fb.c b/drivers/gpu/drm/nouveau/nv50_fb.c
index 32611bd..cd1988b 100644
--- a/drivers/gpu/drm/nouveau/nv50_fb.c
+++ b/drivers/gpu/drm/nouveau/nv50_fb.c
@@ -20,6 +20,7 @@ nv50_fb_init(struct drm_device *dev)
 	case 0x50:
 		nv_wr32(dev, 0x100c90, 0x0707ff);
 		break;
+	case 0xa3:
 	case 0xa5:
 	case 0xa8:
 		nv_wr32(dev, 0x100c90, 0x0d0fff);
@@ -36,3 +37,42 @@ void
 nv50_fb_takedown(struct drm_device *dev)
 {
 }
+
+void
+nv50_fb_vm_trap(struct drm_device *dev, int display, const char *name)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 trap[6], idx, chinst;
+	int i, ch;
+
+	idx = nv_rd32(dev, 0x100c90);
+	if (!(idx & 0x80000000))
+		return;
+	idx &= 0x00ffffff;
+
+	for (i = 0; i < 6; i++) {
+		nv_wr32(dev, 0x100c90, idx | i << 24);
+		trap[i] = nv_rd32(dev, 0x100c94);
+	}
+	nv_wr32(dev, 0x100c90, idx | 0x80000000);
+
+	if (!display)
+		return;
+
+	chinst = (trap[2] << 16) | trap[1];
+	for (ch = 0; ch < dev_priv->engine.fifo.channels; ch++) {
+		struct nouveau_channel *chan = dev_priv->fifos[ch];
+
+		if (!chan || !chan->ramin)
+			continue;
+
+		if (chinst == chan->ramin->vinst >> 12)
+			break;
+	}
+
+	NV_INFO(dev, "%s - VM: Trapped %s at %02x%04x%04x status %08x "
+		     "channel %d (0x%08x)\n",
+		name, (trap[5] & 0x100 ? "read" : "write"),
+		trap[5] & 0xff, trap[4] & 0xffff, trap[3] & 0xffff,
+		trap[0], ch, chinst);
+}
diff --git a/drivers/gpu/drm/nouveau/nv50_fbcon.c b/drivers/gpu/drm/nouveau/nv50_fbcon.c
index 6bf025c..6dcf048 100644
--- a/drivers/gpu/drm/nouveau/nv50_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nv50_fbcon.c
@@ -1,6 +1,7 @@
 #include "drmP.h"
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
+#include "nouveau_ramht.h"
 #include "nouveau_fbcon.h"
 
 void
@@ -193,7 +194,8 @@ nv50_fbcon_accel_init(struct fb_info *info)
 	if (ret)
 		return ret;
 
-	ret = nouveau_gpuobj_ref_add(dev, dev_priv->channel, Nv2D, eng2d, NULL);
+	ret = nouveau_ramht_insert(dev_priv->channel, Nv2D, eng2d);
+	nouveau_gpuobj_ref(NULL, &eng2d);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/nouveau/nv50_fifo.c b/drivers/gpu/drm/nouveau/nv50_fifo.c
index e20c0e2..a46a961 100644
--- a/drivers/gpu/drm/nouveau/nv50_fifo.c
+++ b/drivers/gpu/drm/nouveau/nv50_fifo.c
@@ -27,42 +27,37 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
-
-struct nv50_fifo_priv {
-	struct nouveau_gpuobj_ref *thingo[2];
-	int cur_thingo;
-};
-
-#define IS_G80 ((dev_priv->chipset & 0xf0) == 0x50)
+#include "nouveau_ramht.h"
 
 static void
-nv50_fifo_init_thingo(struct drm_device *dev)
+nv50_fifo_playlist_update(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nv50_fifo_priv *priv = dev_priv->engine.fifo.priv;
-	struct nouveau_gpuobj_ref *cur;
+	struct nouveau_fifo_engine *pfifo = &dev_priv->engine.fifo;
+	struct nouveau_gpuobj *cur;
 	int i, nr;
 
 	NV_DEBUG(dev, "\n");
 
-	cur = priv->thingo[priv->cur_thingo];
-	priv->cur_thingo = !priv->cur_thingo;
+	cur = pfifo->playlist[pfifo->cur_playlist];
+	pfifo->cur_playlist = !pfifo->cur_playlist;
 
 	/* We never schedule channel 0 or 127 */
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	for (i = 1, nr = 0; i < 127; i++) {
-		if (dev_priv->fifos[i] && dev_priv->fifos[i]->ramfc)
-			nv_wo32(dev, cur->gpuobj, nr++, i);
+		if (dev_priv->fifos[i] && dev_priv->fifos[i]->ramfc) {
+			nv_wo32(cur, (nr * 4), i);
+			nr++;
+		}
 	}
-	dev_priv->engine.instmem.finish_access(dev);
+	dev_priv->engine.instmem.flush(dev);
 
-	nv_wr32(dev, 0x32f4, cur->instance >> 12);
+	nv_wr32(dev, 0x32f4, cur->vinst >> 12);
 	nv_wr32(dev, 0x32ec, nr);
 	nv_wr32(dev, 0x2500, 0x101);
 }
 
-static int
-nv50_fifo_channel_enable(struct drm_device *dev, int channel, bool nt)
+static void
+nv50_fifo_channel_enable(struct drm_device *dev, int channel)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_channel *chan = dev_priv->fifos[channel];
@@ -70,37 +65,28 @@ nv50_fifo_channel_enable(struct drm_device *dev, int channel, bool nt)
 
 	NV_DEBUG(dev, "ch%d\n", channel);
 
-	if (!chan->ramfc)
-		return -EINVAL;
-
-	if (IS_G80)
-		inst = chan->ramfc->instance >> 12;
+	if (dev_priv->chipset == 0x50)
+		inst = chan->ramfc->vinst >> 12;
 	else
-		inst = chan->ramfc->instance >> 8;
-	nv_wr32(dev, NV50_PFIFO_CTX_TABLE(channel),
-		 inst | NV50_PFIFO_CTX_TABLE_CHANNEL_ENABLED);
+		inst = chan->ramfc->vinst >> 8;
 
-	if (!nt)
-		nv50_fifo_init_thingo(dev);
-	return 0;
+	nv_wr32(dev, NV50_PFIFO_CTX_TABLE(channel), inst |
+		     NV50_PFIFO_CTX_TABLE_CHANNEL_ENABLED);
 }
 
 static void
-nv50_fifo_channel_disable(struct drm_device *dev, int channel, bool nt)
+nv50_fifo_channel_disable(struct drm_device *dev, int channel)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	uint32_t inst;
 
-	NV_DEBUG(dev, "ch%d, nt=%d\n", channel, nt);
+	NV_DEBUG(dev, "ch%d\n", channel);
 
-	if (IS_G80)
+	if (dev_priv->chipset == 0x50)
 		inst = NV50_PFIFO_CTX_TABLE_INSTANCE_MASK_G80;
 	else
 		inst = NV50_PFIFO_CTX_TABLE_INSTANCE_MASK_G84;
 	nv_wr32(dev, NV50_PFIFO_CTX_TABLE(channel), inst);
-
-	if (!nt)
-		nv50_fifo_init_thingo(dev);
 }
 
 static void
@@ -133,12 +119,12 @@ nv50_fifo_init_context_table(struct drm_device *dev)
 
 	for (i = 0; i < NV50_PFIFO_CTX_TABLE__SIZE; i++) {
 		if (dev_priv->fifos[i])
-			nv50_fifo_channel_enable(dev, i, true);
+			nv50_fifo_channel_enable(dev, i);
 		else
-			nv50_fifo_channel_disable(dev, i, true);
+			nv50_fifo_channel_disable(dev, i);
 	}
 
-	nv50_fifo_init_thingo(dev);
+	nv50_fifo_playlist_update(dev);
 }
 
 static void
@@ -162,41 +148,38 @@ nv50_fifo_init_regs(struct drm_device *dev)
 	nv_wr32(dev, 0x3270, 0);
 
 	/* Enable dummy channels setup by nv50_instmem.c */
-	nv50_fifo_channel_enable(dev, 0, true);
-	nv50_fifo_channel_enable(dev, 127, true);
+	nv50_fifo_channel_enable(dev, 0);
+	nv50_fifo_channel_enable(dev, 127);
 }
 
 int
 nv50_fifo_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nv50_fifo_priv *priv;
+	struct nouveau_fifo_engine *pfifo = &dev_priv->engine.fifo;
 	int ret;
 
 	NV_DEBUG(dev, "\n");
 
-	priv = dev_priv->engine.fifo.priv;
-	if (priv) {
-		priv->cur_thingo = !priv->cur_thingo;
+	if (pfifo->playlist[0]) {
+		pfifo->cur_playlist = !pfifo->cur_playlist;
 		goto just_reset;
 	}
 
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv)
-		return -ENOMEM;
-	dev_priv->engine.fifo.priv = priv;
-
-	ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0, 128*4, 0x1000,
-				     NVOBJ_FLAG_ZERO_ALLOC, &priv->thingo[0]);
+	ret = nouveau_gpuobj_new(dev, NULL, 128*4, 0x1000,
+				 NVOBJ_FLAG_ZERO_ALLOC,
+				 &pfifo->playlist[0]);
 	if (ret) {
-		NV_ERROR(dev, "error creating thingo0: %d\n", ret);
+		NV_ERROR(dev, "error creating playlist 0: %d\n", ret);
 		return ret;
 	}
 
-	ret = nouveau_gpuobj_new_ref(dev, NULL, NULL, 0, 128*4, 0x1000,
-				     NVOBJ_FLAG_ZERO_ALLOC, &priv->thingo[1]);
+	ret = nouveau_gpuobj_new(dev, NULL, 128*4, 0x1000,
+				 NVOBJ_FLAG_ZERO_ALLOC,
+				 &pfifo->playlist[1]);
 	if (ret) {
-		NV_ERROR(dev, "error creating thingo1: %d\n", ret);
+		nouveau_gpuobj_ref(NULL, &pfifo->playlist[0]);
+		NV_ERROR(dev, "error creating playlist 1: %d\n", ret);
 		return ret;
 	}
 
@@ -216,18 +199,15 @@ void
 nv50_fifo_takedown(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nv50_fifo_priv *priv = dev_priv->engine.fifo.priv;
+	struct nouveau_fifo_engine *pfifo = &dev_priv->engine.fifo;
 
 	NV_DEBUG(dev, "\n");
 
-	if (!priv)
+	if (!pfifo->playlist[0])
 		return;
 
-	nouveau_gpuobj_ref_del(dev, &priv->thingo[0]);
-	nouveau_gpuobj_ref_del(dev, &priv->thingo[1]);
-
-	dev_priv->engine.fifo.priv = NULL;
-	kfree(priv);
+	nouveau_gpuobj_ref(NULL, &pfifo->playlist[0]);
+	nouveau_gpuobj_ref(NULL, &pfifo->playlist[1]);
 }
 
 int
@@ -248,72 +228,61 @@ nv50_fifo_create_context(struct nouveau_channel *chan)
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
-	if (IS_G80) {
-		uint32_t ramin_poffset = chan->ramin->gpuobj->im_pramin->start;
-		uint32_t ramin_voffset = chan->ramin->gpuobj->im_backing_start;
-
-		ret = nouveau_gpuobj_new_fake(dev, ramin_poffset, ramin_voffset,
-					      0x100, NVOBJ_FLAG_ZERO_ALLOC |
-					      NVOBJ_FLAG_ZERO_FREE, &ramfc,
+	if (dev_priv->chipset == 0x50) {
+		ret = nouveau_gpuobj_new_fake(dev, chan->ramin->pinst,
+					      chan->ramin->vinst, 0x100,
+					      NVOBJ_FLAG_ZERO_ALLOC |
+					      NVOBJ_FLAG_ZERO_FREE,
 					      &chan->ramfc);
 		if (ret)
 			return ret;
 
-		ret = nouveau_gpuobj_new_fake(dev, ramin_poffset + 0x0400,
-					      ramin_voffset + 0x0400, 4096,
-					      0, NULL, &chan->cache);
+		ret = nouveau_gpuobj_new_fake(dev, chan->ramin->pinst + 0x0400,
+					      chan->ramin->vinst + 0x0400,
+					      4096, 0, &chan->cache);
 		if (ret)
 			return ret;
 	} else {
-		ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, 0x100, 256,
-					     NVOBJ_FLAG_ZERO_ALLOC |
-					     NVOBJ_FLAG_ZERO_FREE,
-					     &chan->ramfc);
+		ret = nouveau_gpuobj_new(dev, chan, 0x100, 256,
+					 NVOBJ_FLAG_ZERO_ALLOC |
+					 NVOBJ_FLAG_ZERO_FREE, &chan->ramfc);
 		if (ret)
 			return ret;
-		ramfc = chan->ramfc->gpuobj;
 
-		ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, 4096, 1024,
-					     0, &chan->cache);
+		ret = nouveau_gpuobj_new(dev, chan, 4096, 1024,
+					 0, &chan->cache);
 		if (ret)
 			return ret;
 	}
+	ramfc = chan->ramfc;
 
 	spin_lock_irqsave(&dev_priv->context_switch_lock, flags);
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
-
-	nv_wo32(dev, ramfc, 0x48/4, chan->pushbuf->instance >> 4);
-	nv_wo32(dev, ramfc, 0x80/4, (0xc << 24) | (chan->ramht->instance >> 4));
-	nv_wo32(dev, ramfc, 0x44/4, 0x2101ffff);
-	nv_wo32(dev, ramfc, 0x60/4, 0x7fffffff);
-	nv_wo32(dev, ramfc, 0x40/4, 0x00000000);
-	nv_wo32(dev, ramfc, 0x7c/4, 0x30000001);
-	nv_wo32(dev, ramfc, 0x78/4, 0x00000000);
-	nv_wo32(dev, ramfc, 0x3c/4, 0x403f6078);
-	nv_wo32(dev, ramfc, 0x50/4, chan->pushbuf_base +
-				    chan->dma.ib_base * 4);
-	nv_wo32(dev, ramfc, 0x54/4, drm_order(chan->dma.ib_max + 1) << 16);
-
-	if (!IS_G80) {
-		nv_wo32(dev, chan->ramin->gpuobj, 0, chan->id);
-		nv_wo32(dev, chan->ramin->gpuobj, 1,
-						chan->ramfc->instance >> 8);
-
-		nv_wo32(dev, ramfc, 0x88/4, chan->cache->instance >> 10);
-		nv_wo32(dev, ramfc, 0x98/4, chan->ramin->instance >> 12);
+	nv_wo32(ramfc, 0x48, chan->pushbuf->cinst >> 4);
+	nv_wo32(ramfc, 0x80, ((chan->ramht->bits - 9) << 27) |
+			     (4 << 24) /* SEARCH_FULL */ |
+			     (chan->ramht->gpuobj->cinst >> 4));
+	nv_wo32(ramfc, 0x44, 0x2101ffff);
+	nv_wo32(ramfc, 0x60, 0x7fffffff);
+	nv_wo32(ramfc, 0x40, 0x00000000);
+	nv_wo32(ramfc, 0x7c, 0x30000001);
+	nv_wo32(ramfc, 0x78, 0x00000000);
+	nv_wo32(ramfc, 0x3c, 0x403f6078);
+	nv_wo32(ramfc, 0x50, chan->pushbuf_base + chan->dma.ib_base * 4);
+	nv_wo32(ramfc, 0x54, drm_order(chan->dma.ib_max + 1) << 16);
+
+	if (dev_priv->chipset != 0x50) {
+		nv_wo32(chan->ramin, 0, chan->id);
+		nv_wo32(chan->ramin, 4, chan->ramfc->vinst >> 8);
+
+		nv_wo32(ramfc, 0x88, chan->cache->vinst >> 10);
+		nv_wo32(ramfc, 0x98, chan->ramin->vinst >> 12);
 	}
 
-	dev_priv->engine.instmem.finish_access(dev);
-
-	ret = nv50_fifo_channel_enable(dev, chan->id, false);
-	if (ret) {
-		NV_ERROR(dev, "error enabling ch%d: %d\n", chan->id, ret);
-		spin_unlock_irqrestore(&dev_priv->context_switch_lock, flags);
-		nouveau_gpuobj_ref_del(dev, &chan->ramfc);
-		return ret;
-	}
+	dev_priv->engine.instmem.flush(dev);
 
+	nv50_fifo_channel_enable(dev, chan->id);
+	nv50_fifo_playlist_update(dev);
 	spin_unlock_irqrestore(&dev_priv->context_switch_lock, flags);
 	return 0;
 }
@@ -322,20 +291,22 @@ void
 nv50_fifo_destroy_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
-	struct nouveau_gpuobj_ref *ramfc = chan->ramfc;
+	struct nouveau_gpuobj *ramfc = NULL;
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
 	/* This will ensure the channel is seen as disabled. */
-	chan->ramfc = NULL;
-	nv50_fifo_channel_disable(dev, chan->id, false);
+	nouveau_gpuobj_ref(chan->ramfc, &ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->ramfc);
+	nv50_fifo_channel_disable(dev, chan->id);
 
 	/* Dummy channel, also used on ch 127 */
 	if (chan->id == 0)
-		nv50_fifo_channel_disable(dev, 127, false);
+		nv50_fifo_channel_disable(dev, 127);
+	nv50_fifo_playlist_update(dev);
 
-	nouveau_gpuobj_ref_del(dev, &ramfc);
-	nouveau_gpuobj_ref_del(dev, &chan->cache);
+	nouveau_gpuobj_ref(NULL, &ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->cache);
 }
 
 int
@@ -343,69 +314,65 @@ nv50_fifo_load_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *ramfc = chan->ramfc->gpuobj;
-	struct nouveau_gpuobj *cache = chan->cache->gpuobj;
+	struct nouveau_gpuobj *ramfc = chan->ramfc;
+	struct nouveau_gpuobj *cache = chan->cache;
 	int ptr, cnt;
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
-	dev_priv->engine.instmem.prepare_access(dev, false);
-
-	nv_wr32(dev, 0x3330, nv_ro32(dev, ramfc, 0x00/4));
-	nv_wr32(dev, 0x3334, nv_ro32(dev, ramfc, 0x04/4));
-	nv_wr32(dev, 0x3240, nv_ro32(dev, ramfc, 0x08/4));
-	nv_wr32(dev, 0x3320, nv_ro32(dev, ramfc, 0x0c/4));
-	nv_wr32(dev, 0x3244, nv_ro32(dev, ramfc, 0x10/4));
-	nv_wr32(dev, 0x3328, nv_ro32(dev, ramfc, 0x14/4));
-	nv_wr32(dev, 0x3368, nv_ro32(dev, ramfc, 0x18/4));
-	nv_wr32(dev, 0x336c, nv_ro32(dev, ramfc, 0x1c/4));
-	nv_wr32(dev, 0x3370, nv_ro32(dev, ramfc, 0x20/4));
-	nv_wr32(dev, 0x3374, nv_ro32(dev, ramfc, 0x24/4));
-	nv_wr32(dev, 0x3378, nv_ro32(dev, ramfc, 0x28/4));
-	nv_wr32(dev, 0x337c, nv_ro32(dev, ramfc, 0x2c/4));
-	nv_wr32(dev, 0x3228, nv_ro32(dev, ramfc, 0x30/4));
-	nv_wr32(dev, 0x3364, nv_ro32(dev, ramfc, 0x34/4));
-	nv_wr32(dev, 0x32a0, nv_ro32(dev, ramfc, 0x38/4));
-	nv_wr32(dev, 0x3224, nv_ro32(dev, ramfc, 0x3c/4));
-	nv_wr32(dev, 0x324c, nv_ro32(dev, ramfc, 0x40/4));
-	nv_wr32(dev, 0x2044, nv_ro32(dev, ramfc, 0x44/4));
-	nv_wr32(dev, 0x322c, nv_ro32(dev, ramfc, 0x48/4));
-	nv_wr32(dev, 0x3234, nv_ro32(dev, ramfc, 0x4c/4));
-	nv_wr32(dev, 0x3340, nv_ro32(dev, ramfc, 0x50/4));
-	nv_wr32(dev, 0x3344, nv_ro32(dev, ramfc, 0x54/4));
-	nv_wr32(dev, 0x3280, nv_ro32(dev, ramfc, 0x58/4));
-	nv_wr32(dev, 0x3254, nv_ro32(dev, ramfc, 0x5c/4));
-	nv_wr32(dev, 0x3260, nv_ro32(dev, ramfc, 0x60/4));
-	nv_wr32(dev, 0x3264, nv_ro32(dev, ramfc, 0x64/4));
-	nv_wr32(dev, 0x3268, nv_ro32(dev, ramfc, 0x68/4));
-	nv_wr32(dev, 0x326c, nv_ro32(dev, ramfc, 0x6c/4));
-	nv_wr32(dev, 0x32e4, nv_ro32(dev, ramfc, 0x70/4));
-	nv_wr32(dev, 0x3248, nv_ro32(dev, ramfc, 0x74/4));
-	nv_wr32(dev, 0x2088, nv_ro32(dev, ramfc, 0x78/4));
-	nv_wr32(dev, 0x2058, nv_ro32(dev, ramfc, 0x7c/4));
-	nv_wr32(dev, 0x2210, nv_ro32(dev, ramfc, 0x80/4));
-
-	cnt = nv_ro32(dev, ramfc, 0x84/4);
+	nv_wr32(dev, 0x3330, nv_ro32(ramfc, 0x00));
+	nv_wr32(dev, 0x3334, nv_ro32(ramfc, 0x04));
+	nv_wr32(dev, 0x3240, nv_ro32(ramfc, 0x08));
+	nv_wr32(dev, 0x3320, nv_ro32(ramfc, 0x0c));
+	nv_wr32(dev, 0x3244, nv_ro32(ramfc, 0x10));
+	nv_wr32(dev, 0x3328, nv_ro32(ramfc, 0x14));
+	nv_wr32(dev, 0x3368, nv_ro32(ramfc, 0x18));
+	nv_wr32(dev, 0x336c, nv_ro32(ramfc, 0x1c));
+	nv_wr32(dev, 0x3370, nv_ro32(ramfc, 0x20));
+	nv_wr32(dev, 0x3374, nv_ro32(ramfc, 0x24));
+	nv_wr32(dev, 0x3378, nv_ro32(ramfc, 0x28));
+	nv_wr32(dev, 0x337c, nv_ro32(ramfc, 0x2c));
+	nv_wr32(dev, 0x3228, nv_ro32(ramfc, 0x30));
+	nv_wr32(dev, 0x3364, nv_ro32(ramfc, 0x34));
+	nv_wr32(dev, 0x32a0, nv_ro32(ramfc, 0x38));
+	nv_wr32(dev, 0x3224, nv_ro32(ramfc, 0x3c));
+	nv_wr32(dev, 0x324c, nv_ro32(ramfc, 0x40));
+	nv_wr32(dev, 0x2044, nv_ro32(ramfc, 0x44));
+	nv_wr32(dev, 0x322c, nv_ro32(ramfc, 0x48));
+	nv_wr32(dev, 0x3234, nv_ro32(ramfc, 0x4c));
+	nv_wr32(dev, 0x3340, nv_ro32(ramfc, 0x50));
+	nv_wr32(dev, 0x3344, nv_ro32(ramfc, 0x54));
+	nv_wr32(dev, 0x3280, nv_ro32(ramfc, 0x58));
+	nv_wr32(dev, 0x3254, nv_ro32(ramfc, 0x5c));
+	nv_wr32(dev, 0x3260, nv_ro32(ramfc, 0x60));
+	nv_wr32(dev, 0x3264, nv_ro32(ramfc, 0x64));
+	nv_wr32(dev, 0x3268, nv_ro32(ramfc, 0x68));
+	nv_wr32(dev, 0x326c, nv_ro32(ramfc, 0x6c));
+	nv_wr32(dev, 0x32e4, nv_ro32(ramfc, 0x70));
+	nv_wr32(dev, 0x3248, nv_ro32(ramfc, 0x74));
+	nv_wr32(dev, 0x2088, nv_ro32(ramfc, 0x78));
+	nv_wr32(dev, 0x2058, nv_ro32(ramfc, 0x7c));
+	nv_wr32(dev, 0x2210, nv_ro32(ramfc, 0x80));
+
+	cnt = nv_ro32(ramfc, 0x84);
 	for (ptr = 0; ptr < cnt; ptr++) {
 		nv_wr32(dev, NV40_PFIFO_CACHE1_METHOD(ptr),
-			nv_ro32(dev, cache, (ptr * 2) + 0));
+			nv_ro32(cache, (ptr * 8) + 0));
 		nv_wr32(dev, NV40_PFIFO_CACHE1_DATA(ptr),
-			nv_ro32(dev, cache, (ptr * 2) + 1));
+			nv_ro32(cache, (ptr * 8) + 4));
 	}
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUT, cnt << 2);
 	nv_wr32(dev, NV03_PFIFO_CACHE1_GET, 0);
 
 	/* guessing that all the 0x34xx regs aren't on NV50 */
-	if (!IS_G80) {
-		nv_wr32(dev, 0x340c, nv_ro32(dev, ramfc, 0x88/4));
-		nv_wr32(dev, 0x3400, nv_ro32(dev, ramfc, 0x8c/4));
-		nv_wr32(dev, 0x3404, nv_ro32(dev, ramfc, 0x90/4));
-		nv_wr32(dev, 0x3408, nv_ro32(dev, ramfc, 0x94/4));
-		nv_wr32(dev, 0x3410, nv_ro32(dev, ramfc, 0x98/4));
+	if (dev_priv->chipset != 0x50) {
+		nv_wr32(dev, 0x340c, nv_ro32(ramfc, 0x88));
+		nv_wr32(dev, 0x3400, nv_ro32(ramfc, 0x8c));
+		nv_wr32(dev, 0x3404, nv_ro32(ramfc, 0x90));
+		nv_wr32(dev, 0x3408, nv_ro32(ramfc, 0x94));
+		nv_wr32(dev, 0x3410, nv_ro32(ramfc, 0x98));
 	}
 
-	dev_priv->engine.instmem.finish_access(dev);
-
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUSH1, chan->id | (1<<16));
 	return 0;
 }
@@ -431,67 +398,66 @@ nv50_fifo_unload_context(struct drm_device *dev)
 		return -EINVAL;
 	}
 	NV_DEBUG(dev, "ch%d\n", chan->id);
-	ramfc = chan->ramfc->gpuobj;
-	cache = chan->cache->gpuobj;
-
-	dev_priv->engine.instmem.prepare_access(dev, true);
-
-	nv_wo32(dev, ramfc, 0x00/4, nv_rd32(dev, 0x3330));
-	nv_wo32(dev, ramfc, 0x04/4, nv_rd32(dev, 0x3334));
-	nv_wo32(dev, ramfc, 0x08/4, nv_rd32(dev, 0x3240));
-	nv_wo32(dev, ramfc, 0x0c/4, nv_rd32(dev, 0x3320));
-	nv_wo32(dev, ramfc, 0x10/4, nv_rd32(dev, 0x3244));
-	nv_wo32(dev, ramfc, 0x14/4, nv_rd32(dev, 0x3328));
-	nv_wo32(dev, ramfc, 0x18/4, nv_rd32(dev, 0x3368));
-	nv_wo32(dev, ramfc, 0x1c/4, nv_rd32(dev, 0x336c));
-	nv_wo32(dev, ramfc, 0x20/4, nv_rd32(dev, 0x3370));
-	nv_wo32(dev, ramfc, 0x24/4, nv_rd32(dev, 0x3374));
-	nv_wo32(dev, ramfc, 0x28/4, nv_rd32(dev, 0x3378));
-	nv_wo32(dev, ramfc, 0x2c/4, nv_rd32(dev, 0x337c));
-	nv_wo32(dev, ramfc, 0x30/4, nv_rd32(dev, 0x3228));
-	nv_wo32(dev, ramfc, 0x34/4, nv_rd32(dev, 0x3364));
-	nv_wo32(dev, ramfc, 0x38/4, nv_rd32(dev, 0x32a0));
-	nv_wo32(dev, ramfc, 0x3c/4, nv_rd32(dev, 0x3224));
-	nv_wo32(dev, ramfc, 0x40/4, nv_rd32(dev, 0x324c));
-	nv_wo32(dev, ramfc, 0x44/4, nv_rd32(dev, 0x2044));
-	nv_wo32(dev, ramfc, 0x48/4, nv_rd32(dev, 0x322c));
-	nv_wo32(dev, ramfc, 0x4c/4, nv_rd32(dev, 0x3234));
-	nv_wo32(dev, ramfc, 0x50/4, nv_rd32(dev, 0x3340));
-	nv_wo32(dev, ramfc, 0x54/4, nv_rd32(dev, 0x3344));
-	nv_wo32(dev, ramfc, 0x58/4, nv_rd32(dev, 0x3280));
-	nv_wo32(dev, ramfc, 0x5c/4, nv_rd32(dev, 0x3254));
-	nv_wo32(dev, ramfc, 0x60/4, nv_rd32(dev, 0x3260));
-	nv_wo32(dev, ramfc, 0x64/4, nv_rd32(dev, 0x3264));
-	nv_wo32(dev, ramfc, 0x68/4, nv_rd32(dev, 0x3268));
-	nv_wo32(dev, ramfc, 0x6c/4, nv_rd32(dev, 0x326c));
-	nv_wo32(dev, ramfc, 0x70/4, nv_rd32(dev, 0x32e4));
-	nv_wo32(dev, ramfc, 0x74/4, nv_rd32(dev, 0x3248));
-	nv_wo32(dev, ramfc, 0x78/4, nv_rd32(dev, 0x2088));
-	nv_wo32(dev, ramfc, 0x7c/4, nv_rd32(dev, 0x2058));
-	nv_wo32(dev, ramfc, 0x80/4, nv_rd32(dev, 0x2210));
+	ramfc = chan->ramfc;
+	cache = chan->cache;
+
+	nv_wo32(ramfc, 0x00, nv_rd32(dev, 0x3330));
+	nv_wo32(ramfc, 0x04, nv_rd32(dev, 0x3334));
+	nv_wo32(ramfc, 0x08, nv_rd32(dev, 0x3240));
+	nv_wo32(ramfc, 0x0c, nv_rd32(dev, 0x3320));
+	nv_wo32(ramfc, 0x10, nv_rd32(dev, 0x3244));
+	nv_wo32(ramfc, 0x14, nv_rd32(dev, 0x3328));
+	nv_wo32(ramfc, 0x18, nv_rd32(dev, 0x3368));
+	nv_wo32(ramfc, 0x1c, nv_rd32(dev, 0x336c));
+	nv_wo32(ramfc, 0x20, nv_rd32(dev, 0x3370));
+	nv_wo32(ramfc, 0x24, nv_rd32(dev, 0x3374));
+	nv_wo32(ramfc, 0x28, nv_rd32(dev, 0x3378));
+	nv_wo32(ramfc, 0x2c, nv_rd32(dev, 0x337c));
+	nv_wo32(ramfc, 0x30, nv_rd32(dev, 0x3228));
+	nv_wo32(ramfc, 0x34, nv_rd32(dev, 0x3364));
+	nv_wo32(ramfc, 0x38, nv_rd32(dev, 0x32a0));
+	nv_wo32(ramfc, 0x3c, nv_rd32(dev, 0x3224));
+	nv_wo32(ramfc, 0x40, nv_rd32(dev, 0x324c));
+	nv_wo32(ramfc, 0x44, nv_rd32(dev, 0x2044));
+	nv_wo32(ramfc, 0x48, nv_rd32(dev, 0x322c));
+	nv_wo32(ramfc, 0x4c, nv_rd32(dev, 0x3234));
+	nv_wo32(ramfc, 0x50, nv_rd32(dev, 0x3340));
+	nv_wo32(ramfc, 0x54, nv_rd32(dev, 0x3344));
+	nv_wo32(ramfc, 0x58, nv_rd32(dev, 0x3280));
+	nv_wo32(ramfc, 0x5c, nv_rd32(dev, 0x3254));
+	nv_wo32(ramfc, 0x60, nv_rd32(dev, 0x3260));
+	nv_wo32(ramfc, 0x64, nv_rd32(dev, 0x3264));
+	nv_wo32(ramfc, 0x68, nv_rd32(dev, 0x3268));
+	nv_wo32(ramfc, 0x6c, nv_rd32(dev, 0x326c));
+	nv_wo32(ramfc, 0x70, nv_rd32(dev, 0x32e4));
+	nv_wo32(ramfc, 0x74, nv_rd32(dev, 0x3248));
+	nv_wo32(ramfc, 0x78, nv_rd32(dev, 0x2088));
+	nv_wo32(ramfc, 0x7c, nv_rd32(dev, 0x2058));
+	nv_wo32(ramfc, 0x80, nv_rd32(dev, 0x2210));
 
 	put = (nv_rd32(dev, NV03_PFIFO_CACHE1_PUT) & 0x7ff) >> 2;
 	get = (nv_rd32(dev, NV03_PFIFO_CACHE1_GET) & 0x7ff) >> 2;
 	ptr = 0;
 	while (put != get) {
-		nv_wo32(dev, cache, ptr++,
-			    nv_rd32(dev, NV40_PFIFO_CACHE1_METHOD(get)));
-		nv_wo32(dev, cache, ptr++,
-			    nv_rd32(dev, NV40_PFIFO_CACHE1_DATA(get)));
+		nv_wo32(cache, ptr + 0,
+			nv_rd32(dev, NV40_PFIFO_CACHE1_METHOD(get)));
+		nv_wo32(cache, ptr + 4,
+			nv_rd32(dev, NV40_PFIFO_CACHE1_DATA(get)));
 		get = (get + 1) & 0x1ff;
+		ptr += 8;
 	}
 
 	/* guessing that all the 0x34xx regs aren't on NV50 */
-	if (!IS_G80) {
-		nv_wo32(dev, ramfc, 0x84/4, ptr >> 1);
-		nv_wo32(dev, ramfc, 0x88/4, nv_rd32(dev, 0x340c));
-		nv_wo32(dev, ramfc, 0x8c/4, nv_rd32(dev, 0x3400));
-		nv_wo32(dev, ramfc, 0x90/4, nv_rd32(dev, 0x3404));
-		nv_wo32(dev, ramfc, 0x94/4, nv_rd32(dev, 0x3408));
-		nv_wo32(dev, ramfc, 0x98/4, nv_rd32(dev, 0x3410));
+	if (dev_priv->chipset != 0x50) {
+		nv_wo32(ramfc, 0x84, ptr >> 3);
+		nv_wo32(ramfc, 0x88, nv_rd32(dev, 0x340c));
+		nv_wo32(ramfc, 0x8c, nv_rd32(dev, 0x3400));
+		nv_wo32(ramfc, 0x90, nv_rd32(dev, 0x3404));
+		nv_wo32(ramfc, 0x94, nv_rd32(dev, 0x3408));
+		nv_wo32(ramfc, 0x98, nv_rd32(dev, 0x3410));
 	}
 
-	dev_priv->engine.instmem.finish_access(dev);
+	dev_priv->engine.instmem.flush(dev);
 
 	/*XXX: probably reload ch127 (NULL) state back too */
 	nv_wr32(dev, NV03_PFIFO_CACHE1_PUSH1, 127);
diff --git a/drivers/gpu/drm/nouveau/nv50_gpio.c b/drivers/gpu/drm/nouveau/nv50_gpio.c
index bb47ad7..b2fab2b 100644
--- a/drivers/gpu/drm/nouveau/nv50_gpio.c
+++ b/drivers/gpu/drm/nouveau/nv50_gpio.c
@@ -74,3 +74,38 @@ nv50_gpio_set(struct drm_device *dev, enum dcb_gpio_tag tag, int state)
 	nv_wr32(dev, r, v);
 	return 0;
 }
+
+void
+nv50_gpio_irq_enable(struct drm_device *dev, enum dcb_gpio_tag tag, bool on)
+{
+	struct dcb_gpio_entry *gpio;
+	u32 reg, mask;
+
+	gpio = nouveau_bios_gpio_entry(dev, tag);
+	if (!gpio) {
+		NV_ERROR(dev, "gpio tag 0x%02x not found\n", tag);
+		return;
+	}
+
+	reg  = gpio->line < 16 ? 0xe050 : 0xe070;
+	mask = 0x00010001 << (gpio->line & 0xf);
+
+	nv_wr32(dev, reg + 4, mask);
+	nv_mask(dev, reg + 0, mask, on ? mask : 0);
+}
+
+int
+nv50_gpio_init(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	/* disable, and ack any pending gpio interrupts */
+	nv_wr32(dev, 0xe050, 0x00000000);
+	nv_wr32(dev, 0xe054, 0xffffffff);
+	if (dev_priv->chipset >= 0x90) {
+		nv_wr32(dev, 0xe070, 0x00000000);
+		nv_wr32(dev, 0xe074, 0xffffffff);
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nv50_graph.c b/drivers/gpu/drm/nouveau/nv50_graph.c
index b203d06..cbf5ae2 100644
--- a/drivers/gpu/drm/nouveau/nv50_graph.c
+++ b/drivers/gpu/drm/nouveau/nv50_graph.c
@@ -27,11 +27,9 @@
 #include "drmP.h"
 #include "drm.h"
 #include "nouveau_drv.h"
-
+#include "nouveau_ramht.h"
 #include "nouveau_grctx.h"
 
-#define IS_G80 ((dev_priv->chipset & 0xf0) == 0x50)
-
 static void
 nv50_graph_init_reset(struct drm_device *dev)
 {
@@ -103,37 +101,33 @@ static int
 nv50_graph_init_ctxctl(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	struct nouveau_grctx ctx = {};
+	uint32_t *cp;
+	int i;
 
 	NV_DEBUG(dev, "\n");
 
-	if (nouveau_ctxfw) {
-		nouveau_grctx_prog_load(dev);
-		dev_priv->engine.graph.grctx_size = 0x70000;
+	cp = kmalloc(512 * 4, GFP_KERNEL);
+	if (!cp) {
+		NV_ERROR(dev, "failed to allocate ctxprog\n");
+		dev_priv->engine.graph.accel_blocked = true;
+		return 0;
 	}
-	if (!dev_priv->engine.graph.ctxprog) {
-		struct nouveau_grctx ctx = {};
-		uint32_t *cp = kmalloc(512 * 4, GFP_KERNEL);
-		int i;
-		if (!cp) {
-			NV_ERROR(dev, "Couldn't alloc ctxprog! Disabling acceleration.\n");
-			dev_priv->engine.graph.accel_blocked = true;
-			return 0;
-		}
-		ctx.dev = dev;
-		ctx.mode = NOUVEAU_GRCTX_PROG;
-		ctx.data = cp;
-		ctx.ctxprog_max = 512;
-		if (!nv50_grctx_init(&ctx)) {
-			dev_priv->engine.graph.grctx_size = ctx.ctxvals_pos * 4;
-
-			nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_INDEX, 0);
-			for (i = 0; i < ctx.ctxprog_len; i++)
-				nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_DATA, cp[i]);
-		} else {
-			dev_priv->engine.graph.accel_blocked = true;
-		}
-		kfree(cp);
+
+	ctx.dev = dev;
+	ctx.mode = NOUVEAU_GRCTX_PROG;
+	ctx.data = cp;
+	ctx.ctxprog_max = 512;
+	if (!nv50_grctx_init(&ctx)) {
+		dev_priv->engine.graph.grctx_size = ctx.ctxvals_pos * 4;
+
+		nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_INDEX, 0);
+		for (i = 0; i < ctx.ctxprog_len; i++)
+			nv_wr32(dev, NV40_PGRAPH_CTXCTL_UCODE_DATA, cp[i]);
+	} else {
+		dev_priv->engine.graph.accel_blocked = true;
 	}
+	kfree(cp);
 
 	nv_wr32(dev, 0x400320, 4);
 	nv_wr32(dev, NV40_PGRAPH_CTXCTL_CUR, 0);
@@ -164,7 +158,6 @@ void
 nv50_graph_takedown(struct drm_device *dev)
 {
 	NV_DEBUG(dev, "\n");
-	nouveau_grctx_fini(dev);
 }
 
 void
@@ -188,7 +181,7 @@ nv50_graph_channel(struct drm_device *dev)
 	/* Be sure we're not in the middle of a context switch or bad things
 	 * will happen, such as unloading the wrong pgraph context.
 	 */
-	if (!nv_wait(0x400300, 0x00000001, 0x00000000))
+	if (!nv_wait(dev, 0x400300, 0x00000001, 0x00000000))
 		NV_ERROR(dev, "Ctxprog is still running\n");
 
 	inst = nv_rd32(dev, NV50_PGRAPH_CTXCTL_CUR);
@@ -199,7 +192,7 @@ nv50_graph_channel(struct drm_device *dev)
 	for (i = 0; i < dev_priv->engine.fifo.channels; i++) {
 		struct nouveau_channel *chan = dev_priv->fifos[i];
 
-		if (chan && chan->ramin && chan->ramin->instance == inst)
+		if (chan && chan->ramin && chan->ramin->vinst == inst)
 			return chan;
 	}
 
@@ -211,44 +204,36 @@ nv50_graph_create_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_gpuobj *ramin = chan->ramin->gpuobj;
-	struct nouveau_gpuobj *ctx;
+	struct nouveau_gpuobj *ramin = chan->ramin;
 	struct nouveau_pgraph_engine *pgraph = &dev_priv->engine.graph;
+	struct nouveau_grctx ctx = {};
 	int hdr, ret;
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
-	ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, pgraph->grctx_size,
-				     0x1000, NVOBJ_FLAG_ZERO_ALLOC |
-				     NVOBJ_FLAG_ZERO_FREE, &chan->ramin_grctx);
+	ret = nouveau_gpuobj_new(dev, chan, pgraph->grctx_size, 0x1000,
+				 NVOBJ_FLAG_ZERO_ALLOC |
+				 NVOBJ_FLAG_ZERO_FREE, &chan->ramin_grctx);
 	if (ret)
 		return ret;
-	ctx = chan->ramin_grctx->gpuobj;
-
-	hdr = IS_G80 ? 0x200 : 0x20;
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	nv_wo32(dev, ramin, (hdr + 0x00)/4, 0x00190002);
-	nv_wo32(dev, ramin, (hdr + 0x04)/4, chan->ramin_grctx->instance +
-					   pgraph->grctx_size - 1);
-	nv_wo32(dev, ramin, (hdr + 0x08)/4, chan->ramin_grctx->instance);
-	nv_wo32(dev, ramin, (hdr + 0x0c)/4, 0);
-	nv_wo32(dev, ramin, (hdr + 0x10)/4, 0);
-	nv_wo32(dev, ramin, (hdr + 0x14)/4, 0x00010000);
-	dev_priv->engine.instmem.finish_access(dev);
-
-	dev_priv->engine.instmem.prepare_access(dev, true);
-	if (!pgraph->ctxprog) {
-		struct nouveau_grctx ctx = {};
-		ctx.dev = chan->dev;
-		ctx.mode = NOUVEAU_GRCTX_VALS;
-		ctx.data = chan->ramin_grctx->gpuobj;
-		nv50_grctx_init(&ctx);
-	} else {
-		nouveau_grctx_vals_load(dev, ctx);
-	}
-	nv_wo32(dev, ctx, 0x00000/4, chan->ramin->instance >> 12);
-	dev_priv->engine.instmem.finish_access(dev);
 
+	hdr = (dev_priv->chipset == 0x50) ? 0x200 : 0x20;
+	nv_wo32(ramin, hdr + 0x00, 0x00190002);
+	nv_wo32(ramin, hdr + 0x04, chan->ramin_grctx->vinst +
+				   pgraph->grctx_size - 1);
+	nv_wo32(ramin, hdr + 0x08, chan->ramin_grctx->vinst);
+	nv_wo32(ramin, hdr + 0x0c, 0);
+	nv_wo32(ramin, hdr + 0x10, 0);
+	nv_wo32(ramin, hdr + 0x14, 0x00010000);
+
+	ctx.dev = chan->dev;
+	ctx.mode = NOUVEAU_GRCTX_VALS;
+	ctx.data = chan->ramin_grctx;
+	nv50_grctx_init(&ctx);
+
+	nv_wo32(chan->ramin_grctx, 0x00000, chan->ramin->vinst >> 12);
+
+	dev_priv->engine.instmem.flush(dev);
 	return 0;
 }
 
@@ -257,19 +242,18 @@ nv50_graph_destroy_context(struct nouveau_channel *chan)
 {
 	struct drm_device *dev = chan->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	int i, hdr = IS_G80 ? 0x200 : 0x20;
+	int i, hdr = (dev_priv->chipset == 0x50) ? 0x200 : 0x20;
 
 	NV_DEBUG(dev, "ch%d\n", chan->id);
 
-	if (!chan->ramin || !chan->ramin->gpuobj)
+	if (!chan->ramin)
 		return;
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	for (i = hdr; i < hdr + 24; i += 4)
-		nv_wo32(dev, chan->ramin->gpuobj, i/4, 0);
-	dev_priv->engine.instmem.finish_access(dev);
+		nv_wo32(chan->ramin, i, 0);
+	dev_priv->engine.instmem.flush(dev);
 
-	nouveau_gpuobj_ref_del(dev, &chan->ramin_grctx);
+	nouveau_gpuobj_ref(NULL, &chan->ramin_grctx);
 }
 
 static int
@@ -296,7 +280,7 @@ nv50_graph_do_load_context(struct drm_device *dev, uint32_t inst)
 int
 nv50_graph_load_context(struct nouveau_channel *chan)
 {
-	uint32_t inst = chan->ramin->instance >> 12;
+	uint32_t inst = chan->ramin->vinst >> 12;
 
 	NV_DEBUG(chan->dev, "ch%d\n", chan->id);
 	return nv50_graph_do_load_context(chan->dev, inst);
@@ -341,15 +325,16 @@ static int
 nv50_graph_nvsw_dma_vblsem(struct nouveau_channel *chan, int grclass,
 			   int mthd, uint32_t data)
 {
-	struct nouveau_gpuobj_ref *ref = NULL;
+	struct nouveau_gpuobj *gpuobj;
 
-	if (nouveau_gpuobj_ref_find(chan, data, &ref))
+	gpuobj = nouveau_ramht_find(chan, data);
+	if (!gpuobj)
 		return -ENOENT;
 
-	if (nouveau_notifier_offset(ref->gpuobj, NULL))
+	if (nouveau_notifier_offset(gpuobj, NULL))
 		return -EINVAL;
 
-	chan->nvsw.vblsem = ref->gpuobj;
+	chan->nvsw.vblsem = gpuobj;
 	chan->nvsw.vblsem_offset = ~0;
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nv50_grctx.c b/drivers/gpu/drm/nouveau/nv50_grctx.c
index 42a8fb2..336aab2 100644
--- a/drivers/gpu/drm/nouveau/nv50_grctx.c
+++ b/drivers/gpu/drm/nouveau/nv50_grctx.c
@@ -103,6 +103,9 @@
 #include "nouveau_drv.h"
 #include "nouveau_grctx.h"
 
+#define IS_NVA3F(x) (((x) > 0xa0 && (x) < 0xaa) || (x) == 0xaf)
+#define IS_NVAAF(x) ((x) >= 0xaa && (x) <= 0xac)
+
 /*
  * This code deals with PGRAPH contexts on NV50 family cards. Like NV40, it's
  * the GPU itself that does context-switching, but it needs a special
@@ -182,6 +185,7 @@ nv50_grctx_init(struct nouveau_grctx *ctx)
 	case 0xa8:
 	case 0xaa:
 	case 0xac:
+	case 0xaf:
 		break;
 	default:
 		NV_ERROR(ctx->dev, "I don't know how to make a ctxprog for "
@@ -268,6 +272,9 @@ nv50_grctx_init(struct nouveau_grctx *ctx)
  */
 
 static void
+nv50_graph_construct_mmio_ddata(struct nouveau_grctx *ctx);
+
+static void
 nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
@@ -286,7 +293,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 		gr_def(ctx, 0x400840, 0xffe806a8);
 	}
 	gr_def(ctx, 0x400844, 0x00000002);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
+	if (IS_NVA3F(dev_priv->chipset))
 		gr_def(ctx, 0x400894, 0x00001000);
 	gr_def(ctx, 0x4008e8, 0x00000003);
 	gr_def(ctx, 0x4008ec, 0x00001000);
@@ -299,13 +306,15 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 
 	if (dev_priv->chipset >= 0xa0)
 		cp_ctx(ctx, 0x400b00, 0x1);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
+	if (IS_NVA3F(dev_priv->chipset)) {
 		cp_ctx(ctx, 0x400b10, 0x1);
 		gr_def(ctx, 0x400b10, 0x0001629d);
 		cp_ctx(ctx, 0x400b20, 0x1);
 		gr_def(ctx, 0x400b20, 0x0001629d);
 	}
 
+	nv50_graph_construct_mmio_ddata(ctx);
+
 	/* 0C00: VFETCH */
 	cp_ctx(ctx, 0x400c08, 0x2);
 	gr_def(ctx, 0x400c08, 0x0000fe0c);
@@ -314,7 +323,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 	if (dev_priv->chipset < 0xa0) {
 		cp_ctx(ctx, 0x401008, 0x4);
 		gr_def(ctx, 0x401014, 0x00001000);
-	} else if (dev_priv->chipset == 0xa0 || dev_priv->chipset >= 0xaa) {
+	} else if (!IS_NVA3F(dev_priv->chipset)) {
 		cp_ctx(ctx, 0x401008, 0x5);
 		gr_def(ctx, 0x401018, 0x00001000);
 	} else {
@@ -368,10 +377,13 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 	case 0xa3:
 	case 0xa5:
 	case 0xa8:
+	case 0xaf:
 		gr_def(ctx, 0x401c00, 0x142500df);
 		break;
 	}
 
+	/* 2000 */
+
 	/* 2400 */
 	cp_ctx(ctx, 0x402400, 0x1);
 	if (dev_priv->chipset == 0x50)
@@ -380,12 +392,12 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 		cp_ctx(ctx, 0x402408, 0x2);
 	gr_def(ctx, 0x402408, 0x00000600);
 
-	/* 2800 */
+	/* 2800: CSCHED */
 	cp_ctx(ctx, 0x402800, 0x1);
 	if (dev_priv->chipset == 0x50)
 		gr_def(ctx, 0x402800, 0x00000006);
 
-	/* 2C00 */
+	/* 2C00: ZCULL */
 	cp_ctx(ctx, 0x402c08, 0x6);
 	if (dev_priv->chipset != 0x50)
 		gr_def(ctx, 0x402c14, 0x01000000);
@@ -396,23 +408,23 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 		cp_ctx(ctx, 0x402ca0, 0x2);
 	if (dev_priv->chipset < 0xa0)
 		gr_def(ctx, 0x402ca0, 0x00000400);
-	else if (dev_priv->chipset == 0xa0 || dev_priv->chipset >= 0xaa)
+	else if (!IS_NVA3F(dev_priv->chipset))
 		gr_def(ctx, 0x402ca0, 0x00000800);
 	else
 		gr_def(ctx, 0x402ca0, 0x00000400);
 	cp_ctx(ctx, 0x402cac, 0x4);
 
-	/* 3000 */
+	/* 3000: ENG2D */
 	cp_ctx(ctx, 0x403004, 0x1);
 	gr_def(ctx, 0x403004, 0x00000001);
 
-	/* 3404 */
+	/* 3400 */
 	if (dev_priv->chipset >= 0xa0) {
 		cp_ctx(ctx, 0x403404, 0x1);
 		gr_def(ctx, 0x403404, 0x00000001);
 	}
 
-	/* 5000 */
+	/* 5000: CCACHE */
 	cp_ctx(ctx, 0x405000, 0x1);
 	switch (dev_priv->chipset) {
 	case 0x50:
@@ -425,6 +437,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 	case 0xa8:
 	case 0xaa:
 	case 0xac:
+	case 0xaf:
 		gr_def(ctx, 0x405000, 0x000e0080);
 		break;
 	case 0x86:
@@ -441,210 +454,6 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 	cp_ctx(ctx, 0x405024, 0x1);
 	cp_ctx(ctx, 0x40502c, 0x1);
 
-	/* 5400 or maybe 4800 */
-	if (dev_priv->chipset == 0x50) {
-		offset = 0x405400;
-		cp_ctx(ctx, 0x405400, 0xea);
-	} else if (dev_priv->chipset < 0x94) {
-		offset = 0x405400;
-		cp_ctx(ctx, 0x405400, 0xcb);
-	} else if (dev_priv->chipset < 0xa0) {
-		offset = 0x405400;
-		cp_ctx(ctx, 0x405400, 0xcc);
-	} else if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		offset = 0x404800;
-		cp_ctx(ctx, 0x404800, 0xda);
-	} else {
-		offset = 0x405400;
-		cp_ctx(ctx, 0x405400, 0xd4);
-	}
-	gr_def(ctx, offset + 0x0c, 0x00000002);
-	gr_def(ctx, offset + 0x10, 0x00000001);
-	if (dev_priv->chipset >= 0x94)
-		offset += 4;
-	gr_def(ctx, offset + 0x1c, 0x00000001);
-	gr_def(ctx, offset + 0x20, 0x00000100);
-	gr_def(ctx, offset + 0x38, 0x00000002);
-	gr_def(ctx, offset + 0x3c, 0x00000001);
-	gr_def(ctx, offset + 0x40, 0x00000001);
-	gr_def(ctx, offset + 0x50, 0x00000001);
-	gr_def(ctx, offset + 0x54, 0x003fffff);
-	gr_def(ctx, offset + 0x58, 0x00001fff);
-	gr_def(ctx, offset + 0x60, 0x00000001);
-	gr_def(ctx, offset + 0x64, 0x00000001);
-	gr_def(ctx, offset + 0x6c, 0x00000001);
-	gr_def(ctx, offset + 0x70, 0x00000001);
-	gr_def(ctx, offset + 0x74, 0x00000001);
-	gr_def(ctx, offset + 0x78, 0x00000004);
-	gr_def(ctx, offset + 0x7c, 0x00000001);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		offset += 4;
-	gr_def(ctx, offset + 0x80, 0x00000001);
-	gr_def(ctx, offset + 0x84, 0x00000001);
-	gr_def(ctx, offset + 0x88, 0x00000007);
-	gr_def(ctx, offset + 0x8c, 0x00000001);
-	gr_def(ctx, offset + 0x90, 0x00000007);
-	gr_def(ctx, offset + 0x94, 0x00000001);
-	gr_def(ctx, offset + 0x98, 0x00000001);
-	gr_def(ctx, offset + 0x9c, 0x00000001);
-	if (dev_priv->chipset == 0x50) {
-		 gr_def(ctx, offset + 0xb0, 0x00000001);
-		 gr_def(ctx, offset + 0xb4, 0x00000001);
-		 gr_def(ctx, offset + 0xbc, 0x00000001);
-		 gr_def(ctx, offset + 0xc0, 0x0000000a);
-		 gr_def(ctx, offset + 0xd0, 0x00000040);
-		 gr_def(ctx, offset + 0xd8, 0x00000002);
-		 gr_def(ctx, offset + 0xdc, 0x00000100);
-		 gr_def(ctx, offset + 0xe0, 0x00000001);
-		 gr_def(ctx, offset + 0xe4, 0x00000100);
-		 gr_def(ctx, offset + 0x100, 0x00000001);
-		 gr_def(ctx, offset + 0x124, 0x00000004);
-		 gr_def(ctx, offset + 0x13c, 0x00000001);
-		 gr_def(ctx, offset + 0x140, 0x00000100);
-		 gr_def(ctx, offset + 0x148, 0x00000001);
-		 gr_def(ctx, offset + 0x154, 0x00000100);
-		 gr_def(ctx, offset + 0x158, 0x00000001);
-		 gr_def(ctx, offset + 0x15c, 0x00000100);
-		 gr_def(ctx, offset + 0x164, 0x00000001);
-		 gr_def(ctx, offset + 0x170, 0x00000100);
-		 gr_def(ctx, offset + 0x174, 0x00000001);
-		 gr_def(ctx, offset + 0x17c, 0x00000001);
-		 gr_def(ctx, offset + 0x188, 0x00000002);
-		 gr_def(ctx, offset + 0x190, 0x00000001);
-		 gr_def(ctx, offset + 0x198, 0x00000001);
-		 gr_def(ctx, offset + 0x1ac, 0x00000003);
-		 offset += 0xd0;
-	} else {
-		gr_def(ctx, offset + 0xb0, 0x00000001);
-		gr_def(ctx, offset + 0xb4, 0x00000100);
-		gr_def(ctx, offset + 0xbc, 0x00000001);
-		gr_def(ctx, offset + 0xc8, 0x00000100);
-		gr_def(ctx, offset + 0xcc, 0x00000001);
-		gr_def(ctx, offset + 0xd0, 0x00000100);
-		gr_def(ctx, offset + 0xd8, 0x00000001);
-		gr_def(ctx, offset + 0xe4, 0x00000100);
-	}
-	gr_def(ctx, offset + 0xf8, 0x00000004);
-	gr_def(ctx, offset + 0xfc, 0x00000070);
-	gr_def(ctx, offset + 0x100, 0x00000080);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		offset += 4;
-	gr_def(ctx, offset + 0x114, 0x0000000c);
-	if (dev_priv->chipset == 0x50)
-		offset -= 4;
-	gr_def(ctx, offset + 0x11c, 0x00000008);
-	gr_def(ctx, offset + 0x120, 0x00000014);
-	if (dev_priv->chipset == 0x50) {
-		gr_def(ctx, offset + 0x124, 0x00000026);
-		offset -= 0x18;
-	} else {
-		gr_def(ctx, offset + 0x128, 0x00000029);
-		gr_def(ctx, offset + 0x12c, 0x00000027);
-		gr_def(ctx, offset + 0x130, 0x00000026);
-		gr_def(ctx, offset + 0x134, 0x00000008);
-		gr_def(ctx, offset + 0x138, 0x00000004);
-		gr_def(ctx, offset + 0x13c, 0x00000027);
-	}
-	gr_def(ctx, offset + 0x148, 0x00000001);
-	gr_def(ctx, offset + 0x14c, 0x00000002);
-	gr_def(ctx, offset + 0x150, 0x00000003);
-	gr_def(ctx, offset + 0x154, 0x00000004);
-	gr_def(ctx, offset + 0x158, 0x00000005);
-	gr_def(ctx, offset + 0x15c, 0x00000006);
-	gr_def(ctx, offset + 0x160, 0x00000007);
-	gr_def(ctx, offset + 0x164, 0x00000001);
-	gr_def(ctx, offset + 0x1a8, 0x000000cf);
-	if (dev_priv->chipset == 0x50)
-		offset -= 4;
-	gr_def(ctx, offset + 0x1d8, 0x00000080);
-	gr_def(ctx, offset + 0x1dc, 0x00000004);
-	gr_def(ctx, offset + 0x1e0, 0x00000004);
-	if (dev_priv->chipset == 0x50)
-		offset -= 4;
-	else
-		gr_def(ctx, offset + 0x1e4, 0x00000003);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		gr_def(ctx, offset + 0x1ec, 0x00000003);
-		offset += 8;
-	}
-	gr_def(ctx, offset + 0x1e8, 0x00000001);
-	if (dev_priv->chipset == 0x50)
-		offset -= 4;
-	gr_def(ctx, offset + 0x1f4, 0x00000012);
-	gr_def(ctx, offset + 0x1f8, 0x00000010);
-	gr_def(ctx, offset + 0x1fc, 0x0000000c);
-	gr_def(ctx, offset + 0x200, 0x00000001);
-	gr_def(ctx, offset + 0x210, 0x00000004);
-	gr_def(ctx, offset + 0x214, 0x00000002);
-	gr_def(ctx, offset + 0x218, 0x00000004);
-	if (dev_priv->chipset >= 0xa0)
-		offset += 4;
-	gr_def(ctx, offset + 0x224, 0x003fffff);
-	gr_def(ctx, offset + 0x228, 0x00001fff);
-	if (dev_priv->chipset == 0x50)
-		offset -= 0x20;
-	else if (dev_priv->chipset >= 0xa0) {
-		gr_def(ctx, offset + 0x250, 0x00000001);
-		gr_def(ctx, offset + 0x254, 0x00000001);
-		gr_def(ctx, offset + 0x258, 0x00000002);
-		offset += 0x10;
-	}
-	gr_def(ctx, offset + 0x250, 0x00000004);
-	gr_def(ctx, offset + 0x254, 0x00000014);
-	gr_def(ctx, offset + 0x258, 0x00000001);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		offset += 4;
-	gr_def(ctx, offset + 0x264, 0x00000002);
-	if (dev_priv->chipset >= 0xa0)
-		offset += 8;
-	gr_def(ctx, offset + 0x270, 0x00000001);
-	gr_def(ctx, offset + 0x278, 0x00000002);
-	gr_def(ctx, offset + 0x27c, 0x00001000);
-	if (dev_priv->chipset == 0x50)
-		offset -= 0xc;
-	else {
-		gr_def(ctx, offset + 0x280, 0x00000e00);
-		gr_def(ctx, offset + 0x284, 0x00001000);
-		gr_def(ctx, offset + 0x288, 0x00001e00);
-	}
-	gr_def(ctx, offset + 0x290, 0x00000001);
-	gr_def(ctx, offset + 0x294, 0x00000001);
-	gr_def(ctx, offset + 0x298, 0x00000001);
-	gr_def(ctx, offset + 0x29c, 0x00000001);
-	gr_def(ctx, offset + 0x2a0, 0x00000001);
-	gr_def(ctx, offset + 0x2b0, 0x00000200);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		gr_def(ctx, offset + 0x2b4, 0x00000200);
-		offset += 4;
-	}
-	if (dev_priv->chipset < 0xa0) {
-		gr_def(ctx, offset + 0x2b8, 0x00000001);
-		gr_def(ctx, offset + 0x2bc, 0x00000070);
-		gr_def(ctx, offset + 0x2c0, 0x00000080);
-		gr_def(ctx, offset + 0x2cc, 0x00000001);
-		gr_def(ctx, offset + 0x2d0, 0x00000070);
-		gr_def(ctx, offset + 0x2d4, 0x00000080);
-	} else {
-		gr_def(ctx, offset + 0x2b8, 0x00000001);
-		gr_def(ctx, offset + 0x2bc, 0x000000f0);
-		gr_def(ctx, offset + 0x2c0, 0x000000ff);
-		gr_def(ctx, offset + 0x2cc, 0x00000001);
-		gr_def(ctx, offset + 0x2d0, 0x000000f0);
-		gr_def(ctx, offset + 0x2d4, 0x000000ff);
-		gr_def(ctx, offset + 0x2dc, 0x00000009);
-		offset += 4;
-	}
-	gr_def(ctx, offset + 0x2e4, 0x00000001);
-	gr_def(ctx, offset + 0x2e8, 0x000000cf);
-	gr_def(ctx, offset + 0x2f0, 0x00000001);
-	gr_def(ctx, offset + 0x300, 0x000000cf);
-	gr_def(ctx, offset + 0x308, 0x00000002);
-	gr_def(ctx, offset + 0x310, 0x00000001);
-	gr_def(ctx, offset + 0x318, 0x00000001);
-	gr_def(ctx, offset + 0x320, 0x000000cf);
-	gr_def(ctx, offset + 0x324, 0x000000cf);
-	gr_def(ctx, offset + 0x328, 0x00000001);
-
 	/* 6000? */
 	if (dev_priv->chipset == 0x50)
 		cp_ctx(ctx, 0x4063e0, 0x1);
@@ -661,7 +470,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 			gr_def(ctx, 0x406818, 0x00000f80);
 		else
 			gr_def(ctx, 0x406818, 0x00001f80);
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
+		if (IS_NVA3F(dev_priv->chipset))
 			gr_def(ctx, 0x40681c, 0x00000030);
 		cp_ctx(ctx, 0x406830, 0x3);
 	}
@@ -706,7 +515,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 
 			if (dev_priv->chipset < 0xa0)
 				cp_ctx(ctx, 0x407094 + (i<<8), 1);
-			else if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa)
+			else if (!IS_NVA3F(dev_priv->chipset))
 				cp_ctx(ctx, 0x407094 + (i<<8), 3);
 			else {
 				cp_ctx(ctx, 0x407094 + (i<<8), 4);
@@ -799,6 +608,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 				case 0xa8:
 				case 0xaa:
 				case 0xac:
+				case 0xaf:
 					gr_def(ctx, offset + 0x1c, 0x300c0000);
 					break;
 				}
@@ -825,7 +635,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 				gr_def(ctx, base + 0x304, 0x00007070);
 			else if (dev_priv->chipset < 0xa0)
 				gr_def(ctx, base + 0x304, 0x00027070);
-			else if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa)
+			else if (!IS_NVA3F(dev_priv->chipset))
 				gr_def(ctx, base + 0x304, 0x01127070);
 			else
 				gr_def(ctx, base + 0x304, 0x05127070);
@@ -849,7 +659,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 			if (dev_priv->chipset < 0xa0) {
 				cp_ctx(ctx, base + 0x340, 9);
 				offset = base + 0x340;
-			} else if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa) {
+			} else if (!IS_NVA3F(dev_priv->chipset)) {
 				cp_ctx(ctx, base + 0x33c, 0xb);
 				offset = base + 0x344;
 			} else {
@@ -880,7 +690,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 			gr_def(ctx, offset + 0x0, 0x000001f0);
 			gr_def(ctx, offset + 0x4, 0x00000001);
 			gr_def(ctx, offset + 0x8, 0x00000003);
-			if (dev_priv->chipset == 0x50 || dev_priv->chipset >= 0xaa)
+			if (dev_priv->chipset == 0x50 || IS_NVAAF(dev_priv->chipset))
 				gr_def(ctx, offset + 0xc, 0x00008000);
 			gr_def(ctx, offset + 0x14, 0x00039e00);
 			cp_ctx(ctx, offset + 0x1c, 2);
@@ -892,7 +702,7 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 
 			if (dev_priv->chipset >= 0xa0) {
 				cp_ctx(ctx, base + 0x54c, 2);
-				if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa)
+				if (!IS_NVA3F(dev_priv->chipset))
 					gr_def(ctx, base + 0x54c, 0x003fe006);
 				else
 					gr_def(ctx, base + 0x54c, 0x003fe007);
@@ -948,6 +758,336 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
 	}
 }
 
+static void
+dd_emit(struct nouveau_grctx *ctx, int num, uint32_t val) {
+	int i;
+	if (val && ctx->mode == NOUVEAU_GRCTX_VALS)
+		for (i = 0; i < num; i++)
+			nv_wo32(ctx->data, 4 * (ctx->ctxvals_pos + i), val);
+	ctx->ctxvals_pos += num;
+}
+
+static void
+nv50_graph_construct_mmio_ddata(struct nouveau_grctx *ctx)
+{
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	int base, num;
+	base = ctx->ctxvals_pos;
+
+	/* tesla state */
+	dd_emit(ctx, 1, 0);	/* 00000001 UNK0F90 */
+	dd_emit(ctx, 1, 0);	/* 00000001 UNK135C */
+
+	/* SRC_TIC state */
+	dd_emit(ctx, 1, 0);	/* 00000007 SRC_TILE_MODE_Z */
+	dd_emit(ctx, 1, 2);	/* 00000007 SRC_TILE_MODE_Y */
+	dd_emit(ctx, 1, 1);	/* 00000001 SRC_LINEAR #1 */
+	dd_emit(ctx, 1, 0);	/* 000000ff SRC_ADDRESS_HIGH */
+	dd_emit(ctx, 1, 0);	/* 00000001 SRC_SRGB */
+	if (dev_priv->chipset >= 0x94)
+		dd_emit(ctx, 1, 0);	/* 00000003 eng2d UNK0258 */
+	dd_emit(ctx, 1, 1);	/* 00000fff SRC_DEPTH */
+	dd_emit(ctx, 1, 0x100);	/* 0000ffff SRC_HEIGHT */
+
+	/* turing state */
+	dd_emit(ctx, 1, 0);		/* 0000000f TEXTURES_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 0000000f SAMPLERS_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 000000ff CB_DEF_ADDRESS_HIGH */
+	dd_emit(ctx, 1, 0);		/* ffffffff CB_DEF_ADDRESS_LOW */
+	dd_emit(ctx, 1, 0);		/* ffffffff SHARED_SIZE */
+	dd_emit(ctx, 1, 2);		/* ffffffff REG_MODE */
+	dd_emit(ctx, 1, 1);		/* 0000ffff BLOCK_ALLOC_THREADS */
+	dd_emit(ctx, 1, 1);		/* 00000001 LANES32 */
+	dd_emit(ctx, 1, 0);		/* 000000ff UNK370 */
+	dd_emit(ctx, 1, 0);		/* 000000ff USER_PARAM_UNK */
+	dd_emit(ctx, 1, 0);		/* 000000ff USER_PARAM_COUNT */
+	dd_emit(ctx, 1, 1);		/* 000000ff UNK384 bits 8-15 */
+	dd_emit(ctx, 1, 0x3fffff);	/* 003fffff TIC_LIMIT */
+	dd_emit(ctx, 1, 0x1fff);	/* 000fffff TSC_LIMIT */
+	dd_emit(ctx, 1, 0);		/* 0000ffff CB_ADDR_INDEX */
+	dd_emit(ctx, 1, 1);		/* 000007ff BLOCKDIM_X */
+	dd_emit(ctx, 1, 1);		/* 000007ff BLOCKDIM_XMY */
+	dd_emit(ctx, 1, 0);		/* 00000001 BLOCKDIM_XMY_OVERFLOW */
+	dd_emit(ctx, 1, 1);		/* 0003ffff BLOCKDIM_XMYMZ */
+	dd_emit(ctx, 1, 1);		/* 000007ff BLOCKDIM_Y */
+	dd_emit(ctx, 1, 1);		/* 0000007f BLOCKDIM_Z */
+	dd_emit(ctx, 1, 4);		/* 000000ff CP_REG_ALLOC_TEMP */
+	dd_emit(ctx, 1, 1);		/* 00000001 BLOCKDIM_DIRTY */
+	if (IS_NVA3F(dev_priv->chipset))
+		dd_emit(ctx, 1, 0);	/* 00000003 UNK03E8 */
+	dd_emit(ctx, 1, 1);		/* 0000007f BLOCK_ALLOC_HALFWARPS */
+	dd_emit(ctx, 1, 1);		/* 00000007 LOCAL_WARPS_NO_CLAMP */
+	dd_emit(ctx, 1, 7);		/* 00000007 LOCAL_WARPS_LOG_ALLOC */
+	dd_emit(ctx, 1, 1);		/* 00000007 STACK_WARPS_NO_CLAMP */
+	dd_emit(ctx, 1, 7);		/* 00000007 STACK_WARPS_LOG_ALLOC */
+	dd_emit(ctx, 1, 1);		/* 00001fff BLOCK_ALLOC_REGSLOTS_PACKED */
+	dd_emit(ctx, 1, 1);		/* 00001fff BLOCK_ALLOC_REGSLOTS_STRIDED */
+	dd_emit(ctx, 1, 1);		/* 000007ff BLOCK_ALLOC_THREADS */
+
+	/* compat 2d state */
+	if (dev_priv->chipset == 0x50) {
+		dd_emit(ctx, 4, 0);		/* 0000ffff clip X, Y, W, H */
+
+		dd_emit(ctx, 1, 1);		/* ffffffff chroma COLOR_FORMAT */
+
+		dd_emit(ctx, 1, 1);		/* ffffffff pattern COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff pattern SHAPE */
+		dd_emit(ctx, 1, 1);		/* ffffffff pattern PATTERN_SELECT */
+
+		dd_emit(ctx, 1, 0xa);		/* ffffffff surf2d SRC_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff surf2d DMA_SRC */
+		dd_emit(ctx, 1, 0);		/* 000000ff surf2d SRC_ADDRESS_HIGH */
+		dd_emit(ctx, 1, 0);		/* ffffffff surf2d SRC_ADDRESS_LOW */
+		dd_emit(ctx, 1, 0x40);		/* 0000ffff surf2d SRC_PITCH */
+		dd_emit(ctx, 1, 0);		/* 0000000f surf2d SRC_TILE_MODE_Z */
+		dd_emit(ctx, 1, 2);		/* 0000000f surf2d SRC_TILE_MODE_Y */
+		dd_emit(ctx, 1, 0x100);		/* ffffffff surf2d SRC_HEIGHT */
+		dd_emit(ctx, 1, 1);		/* 00000001 surf2d SRC_LINEAR */
+		dd_emit(ctx, 1, 0x100);		/* ffffffff surf2d SRC_WIDTH */
+
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_B_X */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_B_Y */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_C_X */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_C_Y */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_D_X */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect CLIP_D_Y */
+		dd_emit(ctx, 1, 1);		/* ffffffff gdirect COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff gdirect OPERATION */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect POINT_X */
+		dd_emit(ctx, 1, 0);		/* 0000ffff gdirect POINT_Y */
+
+		dd_emit(ctx, 1, 0);		/* 0000ffff blit SRC_Y */
+		dd_emit(ctx, 1, 0);		/* ffffffff blit OPERATION */
+
+		dd_emit(ctx, 1, 0);		/* ffffffff ifc OPERATION */
+
+		dd_emit(ctx, 1, 0);		/* ffffffff iifc INDEX_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff iifc LUT_OFFSET */
+		dd_emit(ctx, 1, 4);		/* ffffffff iifc COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff iifc OPERATION */
+	}
+
+	/* m2mf state */
+	dd_emit(ctx, 1, 0);		/* ffffffff m2mf LINE_COUNT */
+	dd_emit(ctx, 1, 0);		/* ffffffff m2mf LINE_LENGTH_IN */
+	dd_emit(ctx, 2, 0);		/* ffffffff m2mf OFFSET_IN, OFFSET_OUT */
+	dd_emit(ctx, 1, 1);		/* ffffffff m2mf TILING_DEPTH_OUT */
+	dd_emit(ctx, 1, 0x100);		/* ffffffff m2mf TILING_HEIGHT_OUT */
+	dd_emit(ctx, 1, 0);		/* ffffffff m2mf TILING_POSITION_OUT_Z */
+	dd_emit(ctx, 1, 1);		/* 00000001 m2mf LINEAR_OUT */
+	dd_emit(ctx, 2, 0);		/* 0000ffff m2mf TILING_POSITION_OUT_X, Y */
+	dd_emit(ctx, 1, 0x100);		/* ffffffff m2mf TILING_PITCH_OUT */
+	dd_emit(ctx, 1, 1);		/* ffffffff m2mf TILING_DEPTH_IN */
+	dd_emit(ctx, 1, 0x100);		/* ffffffff m2mf TILING_HEIGHT_IN */
+	dd_emit(ctx, 1, 0);		/* ffffffff m2mf TILING_POSITION_IN_Z */
+	dd_emit(ctx, 1, 1);		/* 00000001 m2mf LINEAR_IN */
+	dd_emit(ctx, 2, 0);		/* 0000ffff m2mf TILING_POSITION_IN_X, Y */
+	dd_emit(ctx, 1, 0x100);		/* ffffffff m2mf TILING_PITCH_IN */
+
+	/* more compat 2d state */
+	if (dev_priv->chipset == 0x50) {
+		dd_emit(ctx, 1, 1);		/* ffffffff line COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff line OPERATION */
+
+		dd_emit(ctx, 1, 1);		/* ffffffff triangle COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff triangle OPERATION */
+
+		dd_emit(ctx, 1, 0);		/* 0000000f sifm TILE_MODE_Z */
+		dd_emit(ctx, 1, 2);		/* 0000000f sifm TILE_MODE_Y */
+		dd_emit(ctx, 1, 0);		/* 000000ff sifm FORMAT_FILTER */
+		dd_emit(ctx, 1, 1);		/* 000000ff sifm FORMAT_ORIGIN */
+		dd_emit(ctx, 1, 0);		/* 0000ffff sifm SRC_PITCH */
+		dd_emit(ctx, 1, 1);		/* 00000001 sifm SRC_LINEAR */
+		dd_emit(ctx, 1, 0);		/* 000000ff sifm SRC_OFFSET_HIGH */
+		dd_emit(ctx, 1, 0);		/* ffffffff sifm SRC_OFFSET */
+		dd_emit(ctx, 1, 0);		/* 0000ffff sifm SRC_HEIGHT */
+		dd_emit(ctx, 1, 0);		/* 0000ffff sifm SRC_WIDTH */
+		dd_emit(ctx, 1, 3);		/* ffffffff sifm COLOR_FORMAT */
+		dd_emit(ctx, 1, 0);		/* ffffffff sifm OPERATION */
+
+		dd_emit(ctx, 1, 0);		/* ffffffff sifc OPERATION */
+	}
+
+	/* tesla state */
+	dd_emit(ctx, 1, 0);		/* 0000000f GP_TEXTURES_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 0000000f GP_SAMPLERS_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 000000ff */
+	dd_emit(ctx, 1, 0);		/* ffffffff */
+	dd_emit(ctx, 1, 4);		/* 000000ff UNK12B0_0 */
+	dd_emit(ctx, 1, 0x70);		/* 000000ff UNK12B0_1 */
+	dd_emit(ctx, 1, 0x80);		/* 000000ff UNK12B0_3 */
+	dd_emit(ctx, 1, 0);		/* 000000ff UNK12B0_2 */
+	dd_emit(ctx, 1, 0);		/* 0000000f FP_TEXTURES_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 0000000f FP_SAMPLERS_LOG2 */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		dd_emit(ctx, 1, 0);	/* ffffffff */
+		dd_emit(ctx, 1, 0);	/* 0000007f MULTISAMPLE_SAMPLES_LOG2 */
+	} else {
+		dd_emit(ctx, 1, 0);	/* 0000000f MULTISAMPLE_SAMPLES_LOG2 */
+	}
+	dd_emit(ctx, 1, 0xc);		/* 000000ff SEMANTIC_COLOR.BFC0_ID */
+	if (dev_priv->chipset != 0x50)
+		dd_emit(ctx, 1, 0);	/* 00000001 SEMANTIC_COLOR.CLMP_EN */
+	dd_emit(ctx, 1, 8);		/* 000000ff SEMANTIC_COLOR.COLR_NR */
+	dd_emit(ctx, 1, 0x14);		/* 000000ff SEMANTIC_COLOR.FFC0_ID */
+	if (dev_priv->chipset == 0x50) {
+		dd_emit(ctx, 1, 0);	/* 000000ff SEMANTIC_LAYER */
+		dd_emit(ctx, 1, 0);	/* 00000001 */
+	} else {
+		dd_emit(ctx, 1, 0);	/* 00000001 SEMANTIC_PTSZ.ENABLE */
+		dd_emit(ctx, 1, 0x29);	/* 000000ff SEMANTIC_PTSZ.PTSZ_ID */
+		dd_emit(ctx, 1, 0x27);	/* 000000ff SEMANTIC_PRIM */
+		dd_emit(ctx, 1, 0x26);	/* 000000ff SEMANTIC_LAYER */
+		dd_emit(ctx, 1, 8);	/* 0000000f SEMANTIC_CLIP.CLIP_HIGH */
+		dd_emit(ctx, 1, 4);	/* 000000ff SEMANTIC_CLIP.CLIP_LO */
+		dd_emit(ctx, 1, 0x27);	/* 000000ff UNK0FD4 */
+		dd_emit(ctx, 1, 0);	/* 00000001 UNK1900 */
+	}
+	dd_emit(ctx, 1, 0);		/* 00000007 RT_CONTROL_MAP0 */
+	dd_emit(ctx, 1, 1);		/* 00000007 RT_CONTROL_MAP1 */
+	dd_emit(ctx, 1, 2);		/* 00000007 RT_CONTROL_MAP2 */
+	dd_emit(ctx, 1, 3);		/* 00000007 RT_CONTROL_MAP3 */
+	dd_emit(ctx, 1, 4);		/* 00000007 RT_CONTROL_MAP4 */
+	dd_emit(ctx, 1, 5);		/* 00000007 RT_CONTROL_MAP5 */
+	dd_emit(ctx, 1, 6);		/* 00000007 RT_CONTROL_MAP6 */
+	dd_emit(ctx, 1, 7);		/* 00000007 RT_CONTROL_MAP7 */
+	dd_emit(ctx, 1, 1);		/* 0000000f RT_CONTROL_COUNT */
+	dd_emit(ctx, 8, 0);		/* 00000001 RT_HORIZ_UNK */
+	dd_emit(ctx, 8, 0);		/* ffffffff RT_ADDRESS_LOW */
+	dd_emit(ctx, 1, 0xcf);		/* 000000ff RT_FORMAT */
+	dd_emit(ctx, 7, 0);		/* 000000ff RT_FORMAT */
+	if (dev_priv->chipset != 0x50)
+		dd_emit(ctx, 3, 0);	/* 1, 1, 1 */
+	else
+		dd_emit(ctx, 2, 0);	/* 1, 1 */
+	dd_emit(ctx, 1, 0);		/* ffffffff GP_ENABLE */
+	dd_emit(ctx, 1, 0x80);		/* 0000ffff GP_VERTEX_OUTPUT_COUNT */
+	dd_emit(ctx, 1, 4);		/* 000000ff GP_REG_ALLOC_RESULT */
+	dd_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		dd_emit(ctx, 1, 3);	/* 00000003 */
+		dd_emit(ctx, 1, 0);	/* 00000001 UNK1418. Alone. */
+	}
+	if (dev_priv->chipset != 0x50)
+		dd_emit(ctx, 1, 3);	/* 00000003 UNK15AC */
+	dd_emit(ctx, 1, 1);		/* ffffffff RASTERIZE_ENABLE */
+	dd_emit(ctx, 1, 0);		/* 00000001 FP_CONTROL.EXPORTS_Z */
+	if (dev_priv->chipset != 0x50)
+		dd_emit(ctx, 1, 0);	/* 00000001 FP_CONTROL.MULTIPLE_RESULTS */
+	dd_emit(ctx, 1, 0x12);		/* 000000ff FP_INTERPOLANT_CTRL.COUNT */
+	dd_emit(ctx, 1, 0x10);		/* 000000ff FP_INTERPOLANT_CTRL.COUNT_NONFLAT */
+	dd_emit(ctx, 1, 0xc);		/* 000000ff FP_INTERPOLANT_CTRL.OFFSET */
+	dd_emit(ctx, 1, 1);		/* 00000001 FP_INTERPOLANT_CTRL.UMASK.W */
+	dd_emit(ctx, 1, 0);		/* 00000001 FP_INTERPOLANT_CTRL.UMASK.X */
+	dd_emit(ctx, 1, 0);		/* 00000001 FP_INTERPOLANT_CTRL.UMASK.Y */
+	dd_emit(ctx, 1, 0);		/* 00000001 FP_INTERPOLANT_CTRL.UMASK.Z */
+	dd_emit(ctx, 1, 4);		/* 000000ff FP_RESULT_COUNT */
+	dd_emit(ctx, 1, 2);		/* ffffffff REG_MODE */
+	dd_emit(ctx, 1, 4);		/* 000000ff FP_REG_ALLOC_TEMP */
+	if (dev_priv->chipset >= 0xa0)
+		dd_emit(ctx, 1, 0);	/* ffffffff */
+	dd_emit(ctx, 1, 0);		/* 00000001 GP_BUILTIN_RESULT_EN.LAYER_IDX */
+	dd_emit(ctx, 1, 0);		/* ffffffff STRMOUT_ENABLE */
+	dd_emit(ctx, 1, 0x3fffff);	/* 003fffff TIC_LIMIT */
+	dd_emit(ctx, 1, 0x1fff);	/* 000fffff TSC_LIMIT */
+	dd_emit(ctx, 1, 0);		/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	if (dev_priv->chipset != 0x50)
+		dd_emit(ctx, 8, 0);	/* 00000001 */
+	if (dev_priv->chipset >= 0xa0) {
+		dd_emit(ctx, 1, 1);	/* 00000007 VTX_ATTR_DEFINE.COMP */
+		dd_emit(ctx, 1, 1);	/* 00000007 VTX_ATTR_DEFINE.SIZE */
+		dd_emit(ctx, 1, 2);	/* 00000007 VTX_ATTR_DEFINE.TYPE */
+		dd_emit(ctx, 1, 0);	/* 000000ff VTX_ATTR_DEFINE.ATTR */
+	}
+	dd_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	dd_emit(ctx, 1, 0x14);		/* 0000001f ZETA_FORMAT */
+	dd_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	dd_emit(ctx, 1, 0);		/* 0000000f VP_TEXTURES_LOG2 */
+	dd_emit(ctx, 1, 0);		/* 0000000f VP_SAMPLERS_LOG2 */
+	if (IS_NVA3F(dev_priv->chipset))
+		dd_emit(ctx, 1, 0);	/* 00000001 */
+	dd_emit(ctx, 1, 2);		/* 00000003 POLYGON_MODE_BACK */
+	if (dev_priv->chipset >= 0xa0)
+		dd_emit(ctx, 1, 0);	/* 00000003 VTX_ATTR_DEFINE.SIZE - 1 */
+	dd_emit(ctx, 1, 0);		/* 0000ffff CB_ADDR_INDEX */
+	if (dev_priv->chipset >= 0xa0)
+		dd_emit(ctx, 1, 0);	/* 00000003 */
+	dd_emit(ctx, 1, 0);		/* 00000001 CULL_FACE_ENABLE */
+	dd_emit(ctx, 1, 1);		/* 00000003 CULL_FACE */
+	dd_emit(ctx, 1, 0);		/* 00000001 FRONT_FACE */
+	dd_emit(ctx, 1, 2);		/* 00000003 POLYGON_MODE_FRONT */
+	dd_emit(ctx, 1, 0x1000);	/* 00007fff UNK141C */
+	if (dev_priv->chipset != 0x50) {
+		dd_emit(ctx, 1, 0xe00);		/* 7fff */
+		dd_emit(ctx, 1, 0x1000);	/* 7fff */
+		dd_emit(ctx, 1, 0x1e00);	/* 7fff */
+	}
+	dd_emit(ctx, 1, 0);		/* 00000001 BEGIN_END_ACTIVE */
+	dd_emit(ctx, 1, 1);		/* 00000001 POLYGON_MODE_??? */
+	dd_emit(ctx, 1, 1);		/* 000000ff GP_REG_ALLOC_TEMP / 4 rounded up */
+	dd_emit(ctx, 1, 1);		/* 000000ff FP_REG_ALLOC_TEMP... without /4? */
+	dd_emit(ctx, 1, 1);		/* 000000ff VP_REG_ALLOC_TEMP / 4 rounded up */
+	dd_emit(ctx, 1, 1);		/* 00000001 */
+	dd_emit(ctx, 1, 0);		/* 00000001 */
+	dd_emit(ctx, 1, 0);		/* 00000001 VTX_ATTR_MASK_UNK0 nonempty */
+	dd_emit(ctx, 1, 0);		/* 00000001 VTX_ATTR_MASK_UNK1 nonempty */
+	dd_emit(ctx, 1, 0x200);		/* 0003ffff GP_VERTEX_OUTPUT_COUNT*GP_REG_ALLOC_RESULT */
+	if (IS_NVA3F(dev_priv->chipset))
+		dd_emit(ctx, 1, 0x200);
+	dd_emit(ctx, 1, 0);		/* 00000001 */
+	if (dev_priv->chipset < 0xa0) {
+		dd_emit(ctx, 1, 1);	/* 00000001 */
+		dd_emit(ctx, 1, 0x70);	/* 000000ff */
+		dd_emit(ctx, 1, 0x80);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 00000001 */
+		dd_emit(ctx, 1, 1);	/* 00000001 */
+		dd_emit(ctx, 1, 0x70);	/* 000000ff */
+		dd_emit(ctx, 1, 0x80);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 000000ff */
+	} else {
+		dd_emit(ctx, 1, 1);	/* 00000001 */
+		dd_emit(ctx, 1, 0xf0);	/* 000000ff */
+		dd_emit(ctx, 1, 0xff);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 00000001 */
+		dd_emit(ctx, 1, 1);	/* 00000001 */
+		dd_emit(ctx, 1, 0xf0);	/* 000000ff */
+		dd_emit(ctx, 1, 0xff);	/* 000000ff */
+		dd_emit(ctx, 1, 0);	/* 000000ff */
+		dd_emit(ctx, 1, 9);	/* 0000003f UNK114C.COMP,SIZE */
+	}
+
+	/* eng2d state */
+	dd_emit(ctx, 1, 0);		/* 00000001 eng2d COLOR_KEY_ENABLE */
+	dd_emit(ctx, 1, 0);		/* 00000007 eng2d COLOR_KEY_FORMAT */
+	dd_emit(ctx, 1, 1);		/* ffffffff eng2d DST_DEPTH */
+	dd_emit(ctx, 1, 0xcf);		/* 000000ff eng2d DST_FORMAT */
+	dd_emit(ctx, 1, 0);		/* ffffffff eng2d DST_LAYER */
+	dd_emit(ctx, 1, 1);		/* 00000001 eng2d DST_LINEAR */
+	dd_emit(ctx, 1, 0);		/* 00000007 eng2d PATTERN_COLOR_FORMAT */
+	dd_emit(ctx, 1, 0);		/* 00000007 eng2d OPERATION */
+	dd_emit(ctx, 1, 0);		/* 00000003 eng2d PATTERN_SELECT */
+	dd_emit(ctx, 1, 0xcf);		/* 000000ff eng2d SIFC_FORMAT */
+	dd_emit(ctx, 1, 0);		/* 00000001 eng2d SIFC_BITMAP_ENABLE */
+	dd_emit(ctx, 1, 2);		/* 00000003 eng2d SIFC_BITMAP_UNK808 */
+	dd_emit(ctx, 1, 0);		/* ffffffff eng2d BLIT_DU_DX_FRACT */
+	dd_emit(ctx, 1, 1);		/* ffffffff eng2d BLIT_DU_DX_INT */
+	dd_emit(ctx, 1, 0);		/* ffffffff eng2d BLIT_DV_DY_FRACT */
+	dd_emit(ctx, 1, 1);		/* ffffffff eng2d BLIT_DV_DY_INT */
+	dd_emit(ctx, 1, 0);		/* 00000001 eng2d BLIT_CONTROL_FILTER */
+	dd_emit(ctx, 1, 0xcf);		/* 000000ff eng2d DRAW_COLOR_FORMAT */
+	dd_emit(ctx, 1, 0xcf);		/* 000000ff eng2d SRC_FORMAT */
+	dd_emit(ctx, 1, 1);		/* 00000001 eng2d SRC_LINEAR #2 */
+
+	num = ctx->ctxvals_pos - base;
+	ctx->ctxvals_pos = base;
+	if (IS_NVA3F(dev_priv->chipset))
+		cp_ctx(ctx, 0x404800, num);
+	else
+		cp_ctx(ctx, 0x405400, num);
+}
+
 /*
  * xfer areas. These are a pain.
  *
@@ -990,28 +1130,33 @@ nv50_graph_construct_mmio(struct nouveau_grctx *ctx)
  * without the help of ctxprog.
  */
 
-static inline void
+static void
 xf_emit(struct nouveau_grctx *ctx, int num, uint32_t val) {
 	int i;
 	if (val && ctx->mode == NOUVEAU_GRCTX_VALS)
 		for (i = 0; i < num; i++)
-			nv_wo32(ctx->dev, ctx->data, ctx->ctxvals_pos + (i << 3), val);
+			nv_wo32(ctx->data, 4 * (ctx->ctxvals_pos + (i << 3)), val);
 	ctx->ctxvals_pos += num << 3;
 }
 
 /* Gene declarations... */
 
+static void nv50_graph_construct_gene_dispatch(struct nouveau_grctx *ctx);
 static void nv50_graph_construct_gene_m2mf(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk1(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk2(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk3(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk4(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk5(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk6(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk7(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk8(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk9(struct nouveau_grctx *ctx);
-static void nv50_graph_construct_gene_unk10(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_ccache(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_unk10xx(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_unk14xx(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_zcull(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_clipid(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_unk24xx(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_vfetch(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_eng2d(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_csched(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_unk1cxx(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_strmout(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_unk34xx(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_ropm1(struct nouveau_grctx *ctx);
+static void nv50_graph_construct_gene_ropm2(struct nouveau_grctx *ctx);
 static void nv50_graph_construct_gene_ropc(struct nouveau_grctx *ctx);
 static void nv50_graph_construct_xfer_tp(struct nouveau_grctx *ctx);
 
@@ -1030,102 +1175,32 @@ nv50_graph_construct_xfer1(struct nouveau_grctx *ctx)
 	if (dev_priv->chipset < 0xa0) {
 		/* Strand 0 */
 		ctx->ctxvals_pos = offset;
-		switch (dev_priv->chipset) {
-		case 0x50:
-			xf_emit(ctx, 0x99, 0);
-			break;
-		case 0x84:
-		case 0x86:
-			xf_emit(ctx, 0x384, 0);
-			break;
-		case 0x92:
-		case 0x94:
-		case 0x96:
-		case 0x98:
-			xf_emit(ctx, 0x380, 0);
-			break;
-		}
-		nv50_graph_construct_gene_m2mf (ctx);
-		switch (dev_priv->chipset) {
-		case 0x50:
-		case 0x84:
-		case 0x86:
-		case 0x98:
-			xf_emit(ctx, 0x4c4, 0);
-			break;
-		case 0x92:
-		case 0x94:
-		case 0x96:
-			xf_emit(ctx, 0x984, 0);
-			break;
-		}
-		nv50_graph_construct_gene_unk5(ctx);
-		if (dev_priv->chipset == 0x50)
-			xf_emit(ctx, 0xa, 0);
-		else
-			xf_emit(ctx, 0xb, 0);
-		nv50_graph_construct_gene_unk4(ctx);
-		nv50_graph_construct_gene_unk3(ctx);
+		nv50_graph_construct_gene_dispatch(ctx);
+		nv50_graph_construct_gene_m2mf(ctx);
+		nv50_graph_construct_gene_unk24xx(ctx);
+		nv50_graph_construct_gene_clipid(ctx);
+		nv50_graph_construct_gene_zcull(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 1 */
 		ctx->ctxvals_pos = offset + 0x1;
-		nv50_graph_construct_gene_unk6(ctx);
-		nv50_graph_construct_gene_unk7(ctx);
-		nv50_graph_construct_gene_unk8(ctx);
-		switch (dev_priv->chipset) {
-		case 0x50:
-		case 0x92:
-			xf_emit(ctx, 0xfb, 0);
-			break;
-		case 0x84:
-			xf_emit(ctx, 0xd3, 0);
-			break;
-		case 0x94:
-		case 0x96:
-			xf_emit(ctx, 0xab, 0);
-			break;
-		case 0x86:
-		case 0x98:
-			xf_emit(ctx, 0x6b, 0);
-			break;
-		}
-		xf_emit(ctx, 2, 0x4e3bfdf);
-		xf_emit(ctx, 4, 0);
-		xf_emit(ctx, 1, 0x0fac6881);
-		xf_emit(ctx, 0xb, 0);
-		xf_emit(ctx, 2, 0x4e3bfdf);
+		nv50_graph_construct_gene_vfetch(ctx);
+		nv50_graph_construct_gene_eng2d(ctx);
+		nv50_graph_construct_gene_csched(ctx);
+		nv50_graph_construct_gene_ropm1(ctx);
+		nv50_graph_construct_gene_ropm2(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 2 */
 		ctx->ctxvals_pos = offset + 0x2;
-		switch (dev_priv->chipset) {
-		case 0x50:
-		case 0x92:
-			xf_emit(ctx, 0xa80, 0);
-			break;
-		case 0x84:
-			xf_emit(ctx, 0xa7e, 0);
-			break;
-		case 0x94:
-		case 0x96:
-			xf_emit(ctx, 0xa7c, 0);
-			break;
-		case 0x86:
-		case 0x98:
-			xf_emit(ctx, 0xa7a, 0);
-			break;
-		}
-		xf_emit(ctx, 1, 0x3fffff);
-		xf_emit(ctx, 2, 0);
-		xf_emit(ctx, 1, 0x1fff);
-		xf_emit(ctx, 0xe, 0);
-		nv50_graph_construct_gene_unk9(ctx);
-		nv50_graph_construct_gene_unk2(ctx);
-		nv50_graph_construct_gene_unk1(ctx);
-		nv50_graph_construct_gene_unk10(ctx);
+		nv50_graph_construct_gene_ccache(ctx);
+		nv50_graph_construct_gene_unk1cxx(ctx);
+		nv50_graph_construct_gene_strmout(ctx);
+		nv50_graph_construct_gene_unk14xx(ctx);
+		nv50_graph_construct_gene_unk10xx(ctx);
+		nv50_graph_construct_gene_unk34xx(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
@@ -1150,86 +1225,46 @@ nv50_graph_construct_xfer1(struct nouveau_grctx *ctx)
 	} else {
 		/* Strand 0 */
 		ctx->ctxvals_pos = offset;
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-			xf_emit(ctx, 0x385, 0);
-		else
-			xf_emit(ctx, 0x384, 0);
+		nv50_graph_construct_gene_dispatch(ctx);
 		nv50_graph_construct_gene_m2mf(ctx);
-		xf_emit(ctx, 0x950, 0);
-		nv50_graph_construct_gene_unk10(ctx);
-		xf_emit(ctx, 1, 0x0fac6881);
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-			xf_emit(ctx, 1, 1);
-			xf_emit(ctx, 3, 0);
-		}
-		nv50_graph_construct_gene_unk8(ctx);
-		if (dev_priv->chipset == 0xa0)
-			xf_emit(ctx, 0x189, 0);
-		else if (dev_priv->chipset == 0xa3)
-			xf_emit(ctx, 0xd5, 0);
-		else if (dev_priv->chipset == 0xa5)
-			xf_emit(ctx, 0x99, 0);
-		else if (dev_priv->chipset == 0xaa)
-			xf_emit(ctx, 0x65, 0);
-		else
-			xf_emit(ctx, 0x6d, 0);
-		nv50_graph_construct_gene_unk9(ctx);
+		nv50_graph_construct_gene_unk34xx(ctx);
+		nv50_graph_construct_gene_csched(ctx);
+		nv50_graph_construct_gene_unk1cxx(ctx);
+		nv50_graph_construct_gene_strmout(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 1 */
 		ctx->ctxvals_pos = offset + 1;
-		nv50_graph_construct_gene_unk1(ctx);
+		nv50_graph_construct_gene_unk10xx(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 2 */
 		ctx->ctxvals_pos = offset + 2;
-		if (dev_priv->chipset == 0xa0) {
-			nv50_graph_construct_gene_unk2(ctx);
-		}
-		xf_emit(ctx, 0x36, 0);
-		nv50_graph_construct_gene_unk5(ctx);
+		if (dev_priv->chipset == 0xa0)
+			nv50_graph_construct_gene_unk14xx(ctx);
+		nv50_graph_construct_gene_unk24xx(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 3 */
 		ctx->ctxvals_pos = offset + 3;
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 1, 1);
-		nv50_graph_construct_gene_unk6(ctx);
+		nv50_graph_construct_gene_vfetch(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 4 */
 		ctx->ctxvals_pos = offset + 4;
-		if (dev_priv->chipset == 0xa0)
-			xf_emit(ctx, 0xa80, 0);
-		else if (dev_priv->chipset == 0xa3)
-			xf_emit(ctx, 0xa7c, 0);
-		else
-			xf_emit(ctx, 0xa7a, 0);
-		xf_emit(ctx, 1, 0x3fffff);
-		xf_emit(ctx, 2, 0);
-		xf_emit(ctx, 1, 0x1fff);
+		nv50_graph_construct_gene_ccache(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
 		/* Strand 5 */
 		ctx->ctxvals_pos = offset + 5;
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 1, 0x0fac6881);
-		xf_emit(ctx, 0xb, 0);
-		xf_emit(ctx, 2, 0x4e3bfdf);
-		xf_emit(ctx, 3, 0);
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-			xf_emit(ctx, 1, 0x11);
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 2, 0x4e3bfdf);
-		xf_emit(ctx, 2, 0);
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-			xf_emit(ctx, 1, 0x11);
-		xf_emit(ctx, 1, 0);
+		nv50_graph_construct_gene_ropm2(ctx);
+		nv50_graph_construct_gene_ropm1(ctx);
+		/* per-ROP context */
 		for (i = 0; i < 8; i++)
 			if (units & (1<<(i+16)))
 				nv50_graph_construct_gene_ropc(ctx);
@@ -1238,10 +1273,9 @@ nv50_graph_construct_xfer1(struct nouveau_grctx *ctx)
 
 		/* Strand 6 */
 		ctx->ctxvals_pos = offset + 6;
-		nv50_graph_construct_gene_unk3(ctx);
-		xf_emit(ctx, 0xb, 0);
-		nv50_graph_construct_gene_unk4(ctx);
-		nv50_graph_construct_gene_unk7(ctx);
+		nv50_graph_construct_gene_zcull(ctx);
+		nv50_graph_construct_gene_clipid(ctx);
+		nv50_graph_construct_gene_eng2d(ctx);
 		if (units & (1 << 0))
 			nv50_graph_construct_xfer_tp(ctx);
 		if (units & (1 << 1))
@@ -1269,7 +1303,7 @@ nv50_graph_construct_xfer1(struct nouveau_grctx *ctx)
 			if (units & (1 << 9))
 				nv50_graph_construct_xfer_tp(ctx);
 		} else {
-			nv50_graph_construct_gene_unk2(ctx);
+			nv50_graph_construct_gene_unk14xx(ctx);
 		}
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
@@ -1290,9 +1324,70 @@ nv50_graph_construct_xfer1(struct nouveau_grctx *ctx)
  */
 
 static void
+nv50_graph_construct_gene_dispatch(struct nouveau_grctx *ctx)
+{
+	/* start of strand 0 */
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	/* SEEK */
+	if (dev_priv->chipset == 0x50)
+		xf_emit(ctx, 5, 0);
+	else if (!IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 6, 0);
+	else
+		xf_emit(ctx, 4, 0);
+	/* SEEK */
+	/* the PGRAPH's internal FIFO */
+	if (dev_priv->chipset == 0x50)
+		xf_emit(ctx, 8*3, 0);
+	else
+		xf_emit(ctx, 0x100*3, 0);
+	/* and another bonus slot?!? */
+	xf_emit(ctx, 3, 0);
+	/* and YET ANOTHER bonus slot? */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 3, 0);
+	/* SEEK */
+	/* CTX_SWITCH: caches of gr objects bound to subchannels. 8 values, last used index */
+	xf_emit(ctx, 9, 0);
+	/* SEEK */
+	xf_emit(ctx, 9, 0);
+	/* SEEK */
+	xf_emit(ctx, 9, 0);
+	/* SEEK */
+	xf_emit(ctx, 9, 0);
+	/* SEEK */
+	if (dev_priv->chipset < 0x90)
+		xf_emit(ctx, 4, 0);
+	/* SEEK */
+	xf_emit(ctx, 2, 0);
+	/* SEEK */
+	xf_emit(ctx, 6*2, 0);
+	xf_emit(ctx, 2, 0);
+	/* SEEK */
+	xf_emit(ctx, 2, 0);
+	/* SEEK */
+	xf_emit(ctx, 6*2, 0);
+	xf_emit(ctx, 2, 0);
+	/* SEEK */
+	if (dev_priv->chipset == 0x50)
+		xf_emit(ctx, 0x1c, 0);
+	else if (dev_priv->chipset < 0xa0)
+		xf_emit(ctx, 0x1e, 0);
+	else
+		xf_emit(ctx, 0x22, 0);
+	/* SEEK */
+	xf_emit(ctx, 0x15, 0);
+}
+
+static void
 nv50_graph_construct_gene_m2mf(struct nouveau_grctx *ctx)
 {
-	/* m2mf state */
+	/* Strand 0, right after dispatch */
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	int smallm2mf = 0;
+	if (dev_priv->chipset < 0x92 || dev_priv->chipset == 0x98)
+		smallm2mf = 1;
+	/* SEEK */
 	xf_emit (ctx, 1, 0);		/* DMA_NOTIFY instance >> 4 */
 	xf_emit (ctx, 1, 0);		/* DMA_BUFFER_IN instance >> 4 */
 	xf_emit (ctx, 1, 0);		/* DMA_BUFFER_OUT instance >> 4 */
@@ -1319,427 +1414,975 @@ nv50_graph_construct_gene_m2mf(struct nouveau_grctx *ctx)
 	xf_emit (ctx, 1, 0);		/* TILING_POSITION_OUT */
 	xf_emit (ctx, 1, 0);		/* OFFSET_IN_HIGH */
 	xf_emit (ctx, 1, 0);		/* OFFSET_OUT_HIGH */
+	/* SEEK */
+	if (smallm2mf)
+		xf_emit(ctx, 0x40, 0);	/* 20 * ffffffff, 3ffff */
+	else
+		xf_emit(ctx, 0x100, 0);	/* 80 * ffffffff, 3ffff */
+	xf_emit(ctx, 4, 0);		/* 1f/7f, 0, 1f/7f, 0 [1f for smallm2mf, 7f otherwise] */
+	/* SEEK */
+	if (smallm2mf)
+		xf_emit(ctx, 0x400, 0);	/* ffffffff */
+	else
+		xf_emit(ctx, 0x800, 0);	/* ffffffff */
+	xf_emit(ctx, 4, 0);		/* ff/1ff, 0, 0, 0 [ff for smallm2mf, 1ff otherwise] */
+	/* SEEK */
+	xf_emit(ctx, 0x40, 0);		/* 20 * bits ffffffff, 3ffff */
+	xf_emit(ctx, 0x6, 0);		/* 1f, 0, 1f, 0, 1f, 0 */
 }
 
 static void
-nv50_graph_construct_gene_unk1(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_ccache(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* end of area 2 on pre-NVA0, area 1 on NVAx */
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x80);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0x80c14);
-	xf_emit(ctx, 1, 0);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0x3ff);
-	else
-		xf_emit(ctx, 1, 0x7ff);
+	xf_emit(ctx, 2, 0);		/* RO */
+	xf_emit(ctx, 0x800, 0);		/* ffffffff */
 	switch (dev_priv->chipset) {
 	case 0x50:
-	case 0x86:
-	case 0x98:
-	case 0xaa:
-	case 0xac:
-		xf_emit(ctx, 0x542, 0);
+	case 0x92:
+	case 0xa0:
+		xf_emit(ctx, 0x2b, 0);
 		break;
 	case 0x84:
-	case 0x92:
+		xf_emit(ctx, 0x29, 0);
+		break;
 	case 0x94:
 	case 0x96:
-		xf_emit(ctx, 0x942, 0);
-		break;
-	case 0xa0:
 	case 0xa3:
-		xf_emit(ctx, 0x2042, 0);
+		xf_emit(ctx, 0x27, 0);
 		break;
+	case 0x86:
+	case 0x98:
 	case 0xa5:
 	case 0xa8:
-		xf_emit(ctx, 0x842, 0);
+	case 0xaa:
+	case 0xac:
+	case 0xaf:
+		xf_emit(ctx, 0x25, 0);
 		break;
 	}
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x80);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x27);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x26);
-	xf_emit(ctx, 3, 0);
+	/* CB bindings, 0x80 of them. The first word is address >> 8, the second
+	 * is size >> 4 | valid << 24 */
+	xf_emit(ctx, 0x100, 0);		/* ffffffff CB_DEF */
+	xf_emit(ctx, 1, 0);		/* 0000007f CB_ADDR_BUFFER */
+	xf_emit(ctx, 1, 0);		/* 0 */
+	xf_emit(ctx, 0x30, 0);		/* ff SET_PROGRAM_CB */
+	xf_emit(ctx, 1, 0);		/* 3f last SET_PROGRAM_CB */
+	xf_emit(ctx, 4, 0);		/* RO */
+	xf_emit(ctx, 0x100, 0);		/* ffffffff */
+	xf_emit(ctx, 8, 0);		/* 1f, 0, 0, ... */
+	xf_emit(ctx, 8, 0);		/* ffffffff */
+	xf_emit(ctx, 4, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 3 */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_CODE_CB */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_TIC */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_TSC */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINKED_TSC */
+	xf_emit(ctx, 1, 0);		/* 000000ff TIC_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff TIC_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0x3fffff);	/* 003fffff TIC_LIMIT */
+	xf_emit(ctx, 1, 0);		/* 000000ff TSC_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff TSC_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0x1fff);	/* 000fffff TSC_LIMIT */
+	xf_emit(ctx, 1, 0);		/* 000000ff VP_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff VP_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 00ffffff VP_START_ID */
+	xf_emit(ctx, 1, 0);		/* 000000ff CB_DEF_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff CB_DEF_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 000000ff GP_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff GP_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 00ffffff GP_START_ID */
+	xf_emit(ctx, 1, 0);		/* 000000ff FP_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff FP_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 00ffffff FP_START_ID */
 }
 
 static void
-nv50_graph_construct_gene_unk10(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_unk10xx(struct nouveau_grctx *ctx)
 {
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	int i;
 	/* end of area 2 on pre-NVA0, area 1 on NVAx */
-	xf_emit(ctx, 0x10, 0x04000000);
-	xf_emit(ctx, 0x24, 0);
-	xf_emit(ctx, 2, 0x04e3bfdf);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x1fe21);
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x80);		/* 0000ffff GP_VERTEX_OUTPUT_COUNT */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 0x80c14);	/* 01ffffff SEMANTIC_COLOR */
+	xf_emit(ctx, 1, 0);		/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	if (dev_priv->chipset == 0x50)
+		xf_emit(ctx, 1, 0x3ff);
+	else
+		xf_emit(ctx, 1, 0x7ff);	/* 000007ff */
+	xf_emit(ctx, 1, 0);		/* 111/113 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	for (i = 0; i < 8; i++) {
+		switch (dev_priv->chipset) {
+		case 0x50:
+		case 0x86:
+		case 0x98:
+		case 0xaa:
+		case 0xac:
+			xf_emit(ctx, 0xa0, 0);	/* ffffffff */
+			break;
+		case 0x84:
+		case 0x92:
+		case 0x94:
+		case 0x96:
+			xf_emit(ctx, 0x120, 0);
+			break;
+		case 0xa5:
+		case 0xa8:
+			xf_emit(ctx, 0x100, 0);	/* ffffffff */
+			break;
+		case 0xa0:
+		case 0xa3:
+		case 0xaf:
+			xf_emit(ctx, 0x400, 0);	/* ffffffff */
+			break;
+		}
+		xf_emit(ctx, 4, 0);	/* 3f, 0, 0, 0 */
+		xf_emit(ctx, 4, 0);	/* ffffffff */
+	}
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x80);		/* 0000ffff GP_VERTEX_OUTPUT_COUNT */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_REG_ALLOC_TEMP */
+	xf_emit(ctx, 1, 1);		/* 00000001 RASTERIZE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0x27);		/* 000000ff UNK0FD4 */
+	xf_emit(ctx, 1, 0);		/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0x26);		/* 000000ff SEMANTIC_LAYER */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+}
+
+static void
+nv50_graph_construct_gene_unk34xx(struct nouveau_grctx *ctx)
+{
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	/* end of area 2 on pre-NVA0, area 1 on NVAx */
+	xf_emit(ctx, 1, 0);		/* 00000001 VIEWPORT_CLIP_RECTS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000003 VIEWPORT_CLIP_MODE */
+	xf_emit(ctx, 0x10, 0x04000000);	/* 07ffffff VIEWPORT_CLIP_HORIZ*8, VIEWPORT_CLIP_VERT*8 */
+	xf_emit(ctx, 1, 0);		/* 00000001 POLYGON_STIPPLE_ENABLE */
+	xf_emit(ctx, 0x20, 0);		/* ffffffff POLYGON_STIPPLE */
+	xf_emit(ctx, 2, 0);		/* 00007fff WINDOW_OFFSET_XY */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0x04e3bfdf);	/* ffffffff UNK0D64 */
+	xf_emit(ctx, 1, 0x04e3bfdf);	/* ffffffff UNK0DF4 */
+	xf_emit(ctx, 1, 0);		/* 00000003 WINDOW_ORIGIN */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0x1fe21);	/* 0001ffff tesla UNK0FAC */
+	if (dev_priv->chipset >= 0xa0)
+		xf_emit(ctx, 1, 0x0fac6881);
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 1);
+		xf_emit(ctx, 3, 0);
+	}
 }
 
 static void
-nv50_graph_construct_gene_unk2(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_unk14xx(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
 	/* middle of area 2 on pre-NVA0, beginning of area 2 on NVA0, area 7 on >NVA0 */
 	if (dev_priv->chipset != 0x50) {
-		xf_emit(ctx, 5, 0);
-		xf_emit(ctx, 1, 0x80c14);
-		xf_emit(ctx, 2, 0);
-		xf_emit(ctx, 1, 0x804);
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 2, 4);
-		xf_emit(ctx, 1, 0x8100c12);
+		xf_emit(ctx, 5, 0);		/* ffffffff */
+		xf_emit(ctx, 1, 0x80c14);	/* 01ffffff SEMANTIC_COLOR */
+		xf_emit(ctx, 1, 0);		/* 00000001 */
+		xf_emit(ctx, 1, 0);		/* 000003ff */
+		xf_emit(ctx, 1, 0x804);		/* 00000fff SEMANTIC_CLIP */
+		xf_emit(ctx, 1, 0);		/* 00000001 */
+		xf_emit(ctx, 2, 4);		/* 7f, ff */
+		xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
 	}
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x10);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 3, 0);
-	else
-		xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x804);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x1a);
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 4);			/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);			/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);			/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x10);			/* 7f/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);			/* 000000ff VP_CLIP_DISTANCE_ENABLE */
 	if (dev_priv->chipset != 0x50)
-		xf_emit(ctx, 1, 0x7f);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x80c14);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 6, 0);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0x3ff);
-	else
-		xf_emit(ctx, 1, 0x7ff);
-	xf_emit(ctx, 1, 0x80c14);
-	xf_emit(ctx, 0x38, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 0x38, 0);
-	xf_emit(ctx, 2, 0x88);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 0x16, 0);
-	xf_emit(ctx, 1, 0x26);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x3f800000);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 4, 0);
-	else
-		xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x1a);
-	xf_emit(ctx, 1, 0x10);
+		xf_emit(ctx, 1, 0);		/* 3ff */
+	xf_emit(ctx, 1, 0);			/* 000000ff tesla UNK1940 */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK0D7C */
+	xf_emit(ctx, 1, 0x804);			/* 00000fff SEMANTIC_CLIP */
+	xf_emit(ctx, 1, 1);			/* 00000001 VIEWPORT_TRANSFORM_EN */
+	xf_emit(ctx, 1, 0x1a);			/* 0000001f POLYGON_MODE */
 	if (dev_priv->chipset != 0x50)
-		xf_emit(ctx, 0x28, 0);
+		xf_emit(ctx, 1, 0x7f);		/* 000000ff tesla UNK0FFC */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 1);			/* 00000001 SHADE_MODEL */
+	xf_emit(ctx, 1, 0x80c14);		/* 01ffffff SEMANTIC_COLOR */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0x8100c12);		/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 4);			/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);			/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);			/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x10);			/* 7f/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK0D7C */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK0F8C */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 1);			/* 00000001 VIEWPORT_TRANSFORM_EN */
+	xf_emit(ctx, 1, 0x8100c12);		/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 4, 0);			/* ffffffff NOPERSPECTIVE_BITMAP */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0);			/* 0000000f */
+	if (dev_priv->chipset == 0x50)
+		xf_emit(ctx, 1, 0x3ff);		/* 000003ff tesla UNK0D68 */
 	else
-		xf_emit(ctx, 0x25, 0);
-	xf_emit(ctx, 1, 0x52);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x26);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x1a);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x00ffff00);
-	xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 1, 0x7ff);		/* 000007ff tesla UNK0D68 */
+	xf_emit(ctx, 1, 0x80c14);		/* 01ffffff SEMANTIC_COLOR */
+	xf_emit(ctx, 1, 0);			/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	xf_emit(ctx, 0x30, 0);			/* ffffffff VIEWPORT_SCALE: X0, Y0, Z0, X1, Y1, ... */
+	xf_emit(ctx, 3, 0);			/* f, 0, 0 */
+	xf_emit(ctx, 3, 0);			/* ffffffff last VIEWPORT_SCALE? */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 1);			/* 00000001 VIEWPORT_TRANSFORM_EN */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1924 */
+	xf_emit(ctx, 1, 0x10);			/* 000000ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);			/* 00000001 */
+	xf_emit(ctx, 0x30, 0);			/* ffffffff VIEWPORT_TRANSLATE */
+	xf_emit(ctx, 3, 0);			/* f, 0, 0 */
+	xf_emit(ctx, 3, 0);			/* ffffffff */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 2, 0x88);			/* 000001ff tesla UNK19D8 */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1924 */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 4);			/* 0000000f CULL_MODE */
+	xf_emit(ctx, 2, 0);			/* 07ffffff SCREEN_SCISSOR */
+	xf_emit(ctx, 2, 0);			/* 00007fff WINDOW_OFFSET_XY */
+	xf_emit(ctx, 1, 0);			/* 00000003 WINDOW_ORIGIN */
+	xf_emit(ctx, 0x10, 0);			/* 00000001 SCISSOR_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0x26);			/* 000000ff SEMANTIC_LAYER */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0);			/* 0000000f */
+	xf_emit(ctx, 1, 0x3f800000);		/* ffffffff LINE_WIDTH */
+	xf_emit(ctx, 1, 0);			/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 00000001 LINE_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 0);		/* 00000001 */
+	xf_emit(ctx, 1, 0x1a);			/* 0000001f POLYGON_MODE */
+	xf_emit(ctx, 1, 0x10);			/* 000000ff VIEW_VOLUME_CLIP_CTRL */
+	if (dev_priv->chipset != 0x50) {
+		xf_emit(ctx, 1, 0);		/* ffffffff */
+		xf_emit(ctx, 1, 0);		/* 00000001 */
+		xf_emit(ctx, 1, 0);		/* 000003ff */
+	}
+	xf_emit(ctx, 0x20, 0);			/* 10xbits ffffffff, 3fffff. SCISSOR_* */
+	xf_emit(ctx, 1, 0);			/* f */
+	xf_emit(ctx, 1, 0);			/* 0? */
+	xf_emit(ctx, 1, 0);			/* ffffffff */
+	xf_emit(ctx, 1, 0);			/* 003fffff */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0x52);			/* 000001ff SEMANTIC_PTSZ */
+	xf_emit(ctx, 1, 0);			/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0x26);			/* 000000ff SEMANTIC_LAYER */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 4);			/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);			/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);			/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x1a);			/* 0000001f POLYGON_MODE */
+	xf_emit(ctx, 1, 0);			/* 00000001 LINE_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0x00ffff00);		/* 00ffffff LINE_STIPPLE_PATTERN */
+	xf_emit(ctx, 1, 0);			/* 0000000f */
 }
 
 static void
-nv50_graph_construct_gene_unk3(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_zcull(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* end of area 0 on pre-NVA0, beginning of area 6 on NVAx */
-	xf_emit(ctx, 1, 0x3f);
-	xf_emit(ctx, 0xa, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 0x04000000);
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 4);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 0x10, 0);
-	else
-		xf_emit(ctx, 0x11, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x1001);
-	xf_emit(ctx, 4, 0xffff);
-	xf_emit(ctx, 0x20, 0);
-	xf_emit(ctx, 0x10, 0x3f800000);
-	xf_emit(ctx, 1, 0x10);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0);
-	else
-		xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 2, 0);
+	/* end of strand 0 on pre-NVA0, beginning of strand 6 on NVAx */
+	/* SEEK */
+	xf_emit(ctx, 1, 0x3f);		/* 0000003f UNK1590 */
+	xf_emit(ctx, 1, 0);		/* 00000001 ALPHA_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_BACK_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_REF */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_BACK_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 2);		/* 00000003 tesla UNK143C */
+	xf_emit(ctx, 2, 0x04000000);	/* 07ffffff tesla UNK0D6C */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0);		/* 00000001 CLIPID_ENABLE */
+	xf_emit(ctx, 2, 0);		/* ffffffff DEPTH_BOUNDS */
+	xf_emit(ctx, 1, 0);		/* 00000001 */
+	xf_emit(ctx, 1, 0);		/* 00000007 DEPTH_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 0000000f CULL_MODE */
+	xf_emit(ctx, 1, 0);		/* 0000ffff */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK0FB0 */
+	xf_emit(ctx, 1, 0);		/* 00000001 POLYGON_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0);		/* 000000ff CLEAR_STENCIL */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_FRONT_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_REF */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_FRONT_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff CLEAR_DEPTH */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	if (dev_priv->chipset != 0x50)
+		xf_emit(ctx, 1, 0);	/* 00000003 tesla UNK1108 */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0x1001);	/* 00001fff ZETA_ARRAY_MODE */
+	/* SEEK */
+	xf_emit(ctx, 4, 0xffff);	/* 0000ffff MSAA_MASK */
+	xf_emit(ctx, 0x10, 0);		/* 00000001 SCISSOR_ENABLE */
+	xf_emit(ctx, 0x10, 0);		/* ffffffff DEPTH_RANGE_NEAR */
+	xf_emit(ctx, 0x10, 0x3f800000);	/* ffffffff DEPTH_RANGE_FAR */
+	xf_emit(ctx, 1, 0x10);		/* 7f/ff/3ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);		/* 00000001 VIEWPORT_CLIP_RECTS_EN */
+	xf_emit(ctx, 1, 3);		/* 00000003 FP_CTRL_UNK196C */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK1968 */
+	if (dev_priv->chipset != 0x50)
+		xf_emit(ctx, 1, 0);	/* 0fffffff tesla UNK1104 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK151C */
 }
 
 static void
-nv50_graph_construct_gene_unk4(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_clipid(struct nouveau_grctx *ctx)
 {
-	/* middle of area 0 on pre-NVA0, middle of area 6 on NVAx */
-	xf_emit(ctx, 2, 0x04000000);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x80);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x80);
-	xf_emit(ctx, 1, 0);
+	/* middle of strand 0 on pre-NVA0 [after 24xx], middle of strand 6 on NVAx */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000007 UNK0FB4 */
+	/* SEEK */
+	xf_emit(ctx, 4, 0);		/* 07ffffff CLIPID_REGION_HORIZ */
+	xf_emit(ctx, 4, 0);		/* 07ffffff CLIPID_REGION_VERT */
+	xf_emit(ctx, 2, 0);		/* 07ffffff SCREEN_SCISSOR */
+	xf_emit(ctx, 2, 0x04000000);	/* 07ffffff UNK1508 */
+	xf_emit(ctx, 1, 0);		/* 00000001 CLIPID_ENABLE */
+	xf_emit(ctx, 1, 0x80);		/* 00003fff CLIPID_WIDTH */
+	xf_emit(ctx, 1, 0);		/* 000000ff CLIPID_ID */
+	xf_emit(ctx, 1, 0);		/* 000000ff CLIPID_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff CLIPID_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0x80);		/* 00003fff CLIPID_HEIGHT */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_CLIPID */
 }
 
 static void
-nv50_graph_construct_gene_unk5(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_unk24xx(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* middle of area 0 on pre-NVA0 [after m2mf], end of area 2 on NVAx */
-	xf_emit(ctx, 2, 4);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0x1c4d, 0);
+	int i;
+	/* middle of strand 0 on pre-NVA0 [after m2mf], end of strand 2 on NVAx */
+	/* SEEK */
+	xf_emit(ctx, 0x33, 0);
+	/* SEEK */
+	xf_emit(ctx, 2, 0);
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 4, 0);	/* RO */
+		xf_emit(ctx, 0xe10, 0); /* 190 * 9: 8*ffffffff, 7ff */
+		xf_emit(ctx, 1, 0);	/* 1ff */
+		xf_emit(ctx, 8, 0);	/* 0? */
+		xf_emit(ctx, 9, 0);	/* ffffffff, 7ff */
+
+		xf_emit(ctx, 4, 0);	/* RO */
+		xf_emit(ctx, 0xe10, 0); /* 190 * 9: 8*ffffffff, 7ff */
+		xf_emit(ctx, 1, 0);	/* 1ff */
+		xf_emit(ctx, 8, 0);	/* 0? */
+		xf_emit(ctx, 9, 0);	/* ffffffff, 7ff */
+	}
 	else
-		xf_emit(ctx, 0x1c4b, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0x8100c12);
+	{
+		xf_emit(ctx, 0xc, 0);	/* RO */
+		/* SEEK */
+		xf_emit(ctx, 0xe10, 0); /* 190 * 9: 8*ffffffff, 7ff */
+		xf_emit(ctx, 1, 0);	/* 1ff */
+		xf_emit(ctx, 8, 0);	/* 0? */
+
+		/* SEEK */
+		xf_emit(ctx, 0xc, 0);	/* RO */
+		/* SEEK */
+		xf_emit(ctx, 0xe10, 0); /* 190 * 9: 8*ffffffff, 7ff */
+		xf_emit(ctx, 1, 0);	/* 1ff */
+		xf_emit(ctx, 8, 0);	/* 0? */
+	}
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
 	if (dev_priv->chipset != 0x50)
-		xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x80c14);
-	xf_emit(ctx, 1, 1);
+		xf_emit(ctx, 1, 3);	/* 00000003 tesla UNK1100 */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 0);		/* 0000000f VP_GP_BUILTIN_ATTR_EN */
+	xf_emit(ctx, 1, 0x80c14);	/* 01ffffff SEMANTIC_COLOR */
+	xf_emit(ctx, 1, 1);		/* 00000001 */
+	/* SEEK */
 	if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0x80c14);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 1, 0x27);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x3c1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x16, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 2, 4);	/* 000000ff */
+	xf_emit(ctx, 1, 0x80c14);	/* 01ffffff SEMANTIC_COLOR */
+	xf_emit(ctx, 1, 0);		/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 POINT_SPRITE_ENABLE */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 0x27);		/* 000000ff SEMANTIC_PRIM_ID */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f */
+	xf_emit(ctx, 1, 1);		/* 00000001 */
+	for (i = 0; i < 10; i++) {
+		/* SEEK */
+		xf_emit(ctx, 0x40, 0);		/* ffffffff */
+		xf_emit(ctx, 0x10, 0);		/* 3, 0, 0.... */
+		xf_emit(ctx, 0x10, 0);		/* ffffffff */
+	}
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000001 POINT_SPRITE_CTRL */
+	xf_emit(ctx, 1, 1);		/* 00000001 */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 4, 0);		/* ffffffff NOPERSPECTIVE_BITMAP */
+	xf_emit(ctx, 0x10, 0);		/* 00ffffff POINT_COORD_REPLACE_MAP */
+	xf_emit(ctx, 1, 0);		/* 00000003 WINDOW_ORIGIN */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
+	if (dev_priv->chipset != 0x50)
+		xf_emit(ctx, 1, 0);	/* 000003ff */
 }
 
 static void
-nv50_graph_construct_gene_unk6(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_vfetch(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* beginning of area 1 on pre-NVA0 [after m2mf], area 3 on NVAx */
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0xf);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 8, 0);
-	else
-		xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x20);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0x11, 0);
+	int acnt = 0x10, rep, i;
+	/* beginning of strand 1 on pre-NVA0, strand 3 on NVAx */
+	if (IS_NVA3F(dev_priv->chipset))
+		acnt = 0x20;
+	/* SEEK */
+	if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 1, 0);	/* ffffffff tesla UNK13A4 */
+		xf_emit(ctx, 1, 1);	/* 00000fff tesla UNK1318 */
+	}
+	xf_emit(ctx, 1, 0);		/* ffffffff VERTEX_BUFFER_FIRST */
+	xf_emit(ctx, 1, 0);		/* 00000001 PRIMITIVE_RESTART_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK0DE8 */
+	xf_emit(ctx, 1, 0);		/* ffffffff PRIMITIVE_RESTART_INDEX */
+	xf_emit(ctx, 1, 0xf);		/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, (acnt/8)-1, 0);	/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, acnt/8, 0);	/* ffffffff VTX_ATTR_MASK_UNK0DD0 */
+	xf_emit(ctx, 1, 0);		/* 0000000f VP_GP_BUILTIN_ATTR_EN */
+	xf_emit(ctx, 1, 0x20);		/* 0000ffff tesla UNK129C */
+	xf_emit(ctx, 1, 0);		/* 000000ff turing UNK370??? */
+	xf_emit(ctx, 1, 0);		/* 0000ffff turing USER_PARAM_COUNT */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0xb, 0);	/* RO */
 	else if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 0xf, 0);
+		xf_emit(ctx, 0x9, 0);	/* RO */
 	else
-		xf_emit(ctx, 0xe, 0);
-	xf_emit(ctx, 1, 0x1a);
-	xf_emit(ctx, 0xd, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 8);
-	xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 0x8, 0);	/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 00000001 EDGE_FLAG */
+	xf_emit(ctx, 1, 0);		/* 00000001 PROVOKING_VERTEX_LAST */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0x1a);		/* 0000001f POLYGON_MODE */
+	/* SEEK */
+	xf_emit(ctx, 0xc, 0);		/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 7f/ff */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* 0000000f VP_GP_BUILTIN_ATTR_EN */
+	xf_emit(ctx, 1, 4);		/* 000001ff UNK1A28 */
+	xf_emit(ctx, 1, 8);		/* 000001ff UNK0DF0 */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
 	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0x3ff);
+		xf_emit(ctx, 1, 0x3ff);	/* 3ff tesla UNK0D68 */
 	else
-		xf_emit(ctx, 1, 0x7ff);
+		xf_emit(ctx, 1, 0x7ff);	/* 7ff tesla UNK0D68 */
 	if (dev_priv->chipset == 0xa8)
-		xf_emit(ctx, 1, 0x1e00);
-	xf_emit(ctx, 0xc, 0);
-	xf_emit(ctx, 1, 0xf);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 0x125, 0);
-	else if (dev_priv->chipset < 0xa0)
-		xf_emit(ctx, 0x126, 0);
-	else if (dev_priv->chipset == 0xa0 || dev_priv->chipset >= 0xaa)
-		xf_emit(ctx, 0x124, 0);
+		xf_emit(ctx, 1, 0x1e00);	/* 7fff */
+	/* SEEK */
+	xf_emit(ctx, 0xc, 0);		/* RO or close */
+	/* SEEK */
+	xf_emit(ctx, 1, 0xf);		/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, (acnt/8)-1, 0);	/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, 1, 0);		/* 0000000f VP_GP_BUILTIN_ATTR_EN */
+	if (dev_priv->chipset > 0x50 && dev_priv->chipset < 0xa0)
+		xf_emit(ctx, 2, 0);	/* ffffffff */
 	else
-		xf_emit(ctx, 0x1f7, 0);
-	xf_emit(ctx, 1, 0xf);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 3, 0);
+		xf_emit(ctx, 1, 0);	/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK0FD8 */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 0x10, 0);	/* 0? */
+		xf_emit(ctx, 2, 0);	/* weird... */
+		xf_emit(ctx, 2, 0);	/* RO */
+	} else {
+		xf_emit(ctx, 8, 0);	/* 0? */
+		xf_emit(ctx, 1, 0);	/* weird... */
+		xf_emit(ctx, 2, 0);	/* RO */
+	}
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* ffffffff VB_ELEMENT_BASE */
+	xf_emit(ctx, 1, 0);		/* ffffffff UNK1438 */
+	xf_emit(ctx, acnt, 0);		/* 1 tesla UNK1000 */
+	if (dev_priv->chipset >= 0xa0)
+		xf_emit(ctx, 1, 0);	/* ffffffff tesla UNK1118? */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* ffffffff VERTEX_ARRAY_UNK90C */
+	xf_emit(ctx, 1, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* ffffffff VERTEX_ARRAY_UNK90C */
+	xf_emit(ctx, 1, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* RO */
+	xf_emit(ctx, 2, 0);		/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK111C? */
+	xf_emit(ctx, 1, 0);		/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 000000ff UNK15F4_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff UNK15F4_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 000000ff UNK0F84_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff UNK0F84_ADDRESS_LOW */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* 00003fff VERTEX_ARRAY_ATTRIB_OFFSET */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* 00000fff VERTEX_ARRAY_STRIDE */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* ffffffff VERTEX_ARRAY_LOW */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* 000000ff VERTEX_ARRAY_HIGH */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* ffffffff VERTEX_LIMIT_LOW */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	xf_emit(ctx, acnt, 0);		/* 000000ff VERTEX_LIMIT_HIGH */
+	xf_emit(ctx, 3, 0);		/* f/1f */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, acnt, 0);		/* f */
+		xf_emit(ctx, 3, 0);		/* f/1f */
+	}
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 2, 0);	/* RO */
+	else
+		xf_emit(ctx, 5, 0);	/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* ffff DMA_VTXBUF */
+	/* SEEK */
+	if (dev_priv->chipset < 0xa0) {
+		xf_emit(ctx, 0x41, 0);	/* RO */
+		/* SEEK */
+		xf_emit(ctx, 0x11, 0);	/* RO */
+	} else if (!IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0x50, 0);	/* RO */
 	else
-		xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0xa1, 0);
+		xf_emit(ctx, 0x58, 0);	/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0xf);		/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, (acnt/8)-1, 0);	/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, 1, 1);		/* 1 UNK0DEC */
+	/* SEEK */
+	xf_emit(ctx, acnt*4, 0);	/* ffffffff VTX_ATTR */
+	xf_emit(ctx, 4, 0);		/* f/1f, 0, 0, 0 */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0x1d, 0);	/* RO */
 	else
-		xf_emit(ctx, 0x5a, 0);
-	xf_emit(ctx, 1, 0xf);
+		xf_emit(ctx, 0x16, 0);	/* RO */
+	/* SEEK */
+	xf_emit(ctx, 1, 0xf);		/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, (acnt/8)-1, 0);	/* ffffffff VP_ATTR_EN */
+	/* SEEK */
 	if (dev_priv->chipset < 0xa0)
-		xf_emit(ctx, 0x834, 0);
-	else if (dev_priv->chipset == 0xa0)
-		xf_emit(ctx, 0x1873, 0);
-	else if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0x8ba, 0);
+		xf_emit(ctx, 8, 0);	/* RO */
+	else if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0xc, 0);	/* RO */
+	else
+		xf_emit(ctx, 7, 0);	/* RO */
+	/* SEEK */
+	xf_emit(ctx, 0xa, 0);		/* RO */
+	if (dev_priv->chipset == 0xa0)
+		rep = 0xc;
+	else
+		rep = 4;
+	for (i = 0; i < rep; i++) {
+		/* SEEK */
+		if (IS_NVA3F(dev_priv->chipset))
+			xf_emit(ctx, 0x20, 0);	/* ffffffff */
+		xf_emit(ctx, 0x200, 0);	/* ffffffff */
+		xf_emit(ctx, 4, 0);	/* 7f/ff, 0, 0, 0 */
+		xf_emit(ctx, 4, 0);	/* ffffffff */
+	}
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 113/111 */
+	xf_emit(ctx, 1, 0xf);		/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, (acnt/8)-1, 0);	/* ffffffff VP_ATTR_EN */
+	xf_emit(ctx, acnt/8, 0);	/* ffffffff VTX_ATTR_MASK_UNK0DD0 */
+	xf_emit(ctx, 1, 0);		/* 0000000f VP_GP_BUILTIN_ATTR_EN */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	/* SEEK */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 7, 0);	/* weird... */
 	else
-		xf_emit(ctx, 0x833, 0);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 0xf, 0);
+		xf_emit(ctx, 5, 0);	/* weird... */
 }
 
 static void
-nv50_graph_construct_gene_unk7(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_eng2d(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* middle of area 1 on pre-NVA0 [after m2mf], middle of area 6 on NVAx */
-	xf_emit(ctx, 2, 0);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 2, 1);
-	else
-		xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0x100);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 8);
-	xf_emit(ctx, 5, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 3, 1);
-	xf_emit(ctx, 1, 0xcf);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 6, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 3, 1);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x15);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x4444480);
-	xf_emit(ctx, 0x37, 0);
+	/* middle of strand 1 on pre-NVA0 [after vfetch], middle of strand 6 on NVAx */
+	/* SEEK */
+	xf_emit(ctx, 2, 0);		/* 0001ffff CLIP_X, CLIP_Y */
+	xf_emit(ctx, 2, 0);		/* 0000ffff CLIP_W, CLIP_H */
+	xf_emit(ctx, 1, 0);		/* 00000001 CLIP_ENABLE */
+	if (dev_priv->chipset < 0xa0) {
+		/* this is useless on everything but the original NV50,
+		 * guess they forgot to nuke it. Or just didn't bother. */
+		xf_emit(ctx, 2, 0);	/* 0000ffff IFC_CLIP_X, Y */
+		xf_emit(ctx, 2, 1);	/* 0000ffff IFC_CLIP_W, H */
+		xf_emit(ctx, 1, 0);	/* 00000001 IFC_CLIP_ENABLE */
+	}
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	xf_emit(ctx, 1, 0x100);		/* 0001ffff DST_WIDTH */
+	xf_emit(ctx, 1, 0x100);		/* 0001ffff DST_HEIGHT */
+	xf_emit(ctx, 1, 0x11);		/* 3f[NV50]/7f[NV84+] DST_FORMAT */
+	xf_emit(ctx, 1, 0);		/* 0001ffff DRAW_POINT_X */
+	xf_emit(ctx, 1, 8);		/* 0000000f DRAW_UNK58C */
+	xf_emit(ctx, 1, 0);		/* 000fffff SIFC_DST_X_FRACT */
+	xf_emit(ctx, 1, 0);		/* 0001ffff SIFC_DST_X_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff SIFC_DST_Y_FRACT */
+	xf_emit(ctx, 1, 0);		/* 0001ffff SIFC_DST_Y_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff SIFC_DX_DU_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff SIFC_DX_DU_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff SIFC_DY_DV_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff SIFC_DY_DV_INT */
+	xf_emit(ctx, 1, 1);		/* 0000ffff SIFC_WIDTH */
+	xf_emit(ctx, 1, 1);		/* 0000ffff SIFC_HEIGHT */
+	xf_emit(ctx, 1, 0xcf);		/* 000000ff SIFC_FORMAT */
+	xf_emit(ctx, 1, 2);		/* 00000003 SIFC_BITMAP_UNK808 */
+	xf_emit(ctx, 1, 0);		/* 00000003 SIFC_BITMAP_LINE_PACK_MODE */
+	xf_emit(ctx, 1, 0);		/* 00000001 SIFC_BITMAP_LSB_FIRST */
+	xf_emit(ctx, 1, 0);		/* 00000001 SIFC_BITMAP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000ffff BLIT_DST_X */
+	xf_emit(ctx, 1, 0);		/* 0000ffff BLIT_DST_Y */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_DU_DX_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff BLIT_DU_DX_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_DV_DY_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff BLIT_DV_DY_INT */
+	xf_emit(ctx, 1, 1);		/* 0000ffff BLIT_DST_W */
+	xf_emit(ctx, 1, 1);		/* 0000ffff BLIT_DST_H */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_SRC_X_FRACT */
+	xf_emit(ctx, 1, 0);		/* 0001ffff BLIT_SRC_X_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_SRC_Y_FRACT */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK888 */
+	xf_emit(ctx, 1, 4);		/* 0000003f UNK884 */
+	xf_emit(ctx, 1, 0);		/* 00000007 UNK880 */
+	xf_emit(ctx, 1, 1);		/* 0000001f tesla UNK0FB8 */
+	xf_emit(ctx, 1, 0x15);		/* 000000ff tesla UNK128C */
+	xf_emit(ctx, 2, 0);		/* 00000007, ffff0ff3 */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK260 */
+	xf_emit(ctx, 1, 0x4444480);	/* 1fffffff UNK870 */
+	/* SEEK */
+	xf_emit(ctx, 0x10, 0);
+	/* SEEK */
+	xf_emit(ctx, 0x27, 0);
 }
 
 static void
-nv50_graph_construct_gene_unk8(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_csched(struct nouveau_grctx *ctx)
 {
-	/* middle of area 1 on pre-NVA0 [after m2mf], middle of area 0 on NVAx */
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x100);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x10001);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x10001);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x10001);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 2);
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	/* middle of strand 1 on pre-NVA0 [after eng2d], middle of strand 0 on NVAx */
+	/* SEEK */
+	xf_emit(ctx, 2, 0);		/* 00007fff WINDOW_OFFSET_XY... what is it doing here??? */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1924 */
+	xf_emit(ctx, 1, 0);		/* 00000003 WINDOW_ORIGIN */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 0);		/* 000003ff */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* ffffffff turing UNK364 */
+	xf_emit(ctx, 1, 0);		/* 0000000f turing UNK36C */
+	xf_emit(ctx, 1, 0);		/* 0000ffff USER_PARAM_COUNT */
+	xf_emit(ctx, 1, 0x100);		/* 00ffffff turing UNK384 */
+	xf_emit(ctx, 1, 0);		/* 0000000f turing UNK2A0 */
+	xf_emit(ctx, 1, 0);		/* 0000ffff GRIDID */
+	xf_emit(ctx, 1, 0x10001);	/* ffffffff GRIDDIM_XY */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0x10001);	/* ffffffff BLOCKDIM_XY */
+	xf_emit(ctx, 1, 1);		/* 0000ffff BLOCKDIM_Z */
+	xf_emit(ctx, 1, 0x10001);	/* 00ffffff BLOCK_ALLOC */
+	xf_emit(ctx, 1, 1);		/* 00000001 LANES32 */
+	xf_emit(ctx, 1, 4);		/* 000000ff FP_REG_ALLOC_TEMP */
+	xf_emit(ctx, 1, 2);		/* 00000003 REG_MODE */
+	/* SEEK */
+	xf_emit(ctx, 0x40, 0);		/* ffffffff USER_PARAM */
+	switch (dev_priv->chipset) {
+	case 0x50:
+	case 0x92:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0x80, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 0x10*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0x84:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0x60, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 0xc*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0x94:
+	case 0x96:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0x40, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 8*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0x86:
+	case 0x98:
+		xf_emit(ctx, 4, 0);	/* f, 0, 0, 0 */
+		xf_emit(ctx, 0x10, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 2*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0xa0:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0xf0, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 0x1e*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0xa3:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0x60, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 0xc*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0xa5:
+	case 0xaf:
+		xf_emit(ctx, 8, 0);	/* 7, 0, 0, 0, ... */
+		xf_emit(ctx, 0x30, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 6*2, 0);	/* ffffffff, 1f */
+		break;
+	case 0xaa:
+		xf_emit(ctx, 0x12, 0);
+		break;
+	case 0xa8:
+	case 0xac:
+		xf_emit(ctx, 4, 0);	/* f, 0, 0, 0 */
+		xf_emit(ctx, 0x10, 0);	/* fff */
+		xf_emit(ctx, 2, 0);	/* ff, fff */
+		xf_emit(ctx, 2*2, 0);	/* ffffffff, 1f */
+		break;
+	}
+	xf_emit(ctx, 1, 0);		/* 0000000f */
+	xf_emit(ctx, 1, 0);		/* 00000000 */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 0000001f */
+	xf_emit(ctx, 4, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 00000003 turing UNK35C */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 4, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 00000003 turing UNK35C */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 000000ff */
 }
 
 static void
-nv50_graph_construct_gene_unk9(struct nouveau_grctx *ctx)
+nv50_graph_construct_gene_unk1cxx(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	/* middle of area 2 on pre-NVA0 [after m2mf], end of area 0 on NVAx */
-	xf_emit(ctx, 1, 0x3f800000);
-	xf_emit(ctx, 6, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0x1a);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x12, 0);
-	xf_emit(ctx, 1, 0x00ffff00);
-	xf_emit(ctx, 6, 0);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 0xf, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 2, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 3);
+	xf_emit(ctx, 2, 0);		/* 00007fff WINDOW_OFFSET_XY */
+	xf_emit(ctx, 1, 0x3f800000);	/* ffffffff LINE_WIDTH */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1658 */
+	xf_emit(ctx, 1, 0);		/* 00000001 POLYGON_SMOOTH_ENABLE */
+	xf_emit(ctx, 3, 0);		/* 00000001 POLYGON_OFFSET_*_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 0000000f CULL_MODE */
+	xf_emit(ctx, 1, 0x1a);		/* 0000001f POLYGON_MODE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 0);		/* 00000001 POINT_SPRITE_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK165C */
+	xf_emit(ctx, 0x10, 0);		/* 00000001 SCISSOR_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0x00ffff00);	/* 00ffffff LINE_STIPPLE_PATTERN */
+	xf_emit(ctx, 1, 0);		/* ffffffff POLYGON_OFFSET_UNITS */
+	xf_emit(ctx, 1, 0);		/* ffffffff POLYGON_OFFSET_FACTOR */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK1668 */
+	xf_emit(ctx, 2, 0);		/* 07ffffff SCREEN_SCISSOR */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0x11);		/* 0000007f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 0000007f RT_FORMAT */
+	xf_emit(ctx, 8, 0);		/* 00000001 RT_HORIZ_LINEAR */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000001 ALPHA_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 ALPHA_TEST_FUNC */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 3);	/* 00000003 UNK16B4 */
 	else if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 0x04000000);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 5);
-	xf_emit(ctx, 1, 0x52);
-	if (dev_priv->chipset == 0x50) {
-		xf_emit(ctx, 0x13, 0);
-	} else {
-		xf_emit(ctx, 4, 0);
-		xf_emit(ctx, 1, 1);
-		if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-			xf_emit(ctx, 0x11, 0);
-		else
-			xf_emit(ctx, 0x10, 0);
+		xf_emit(ctx, 1, 1);	/* 00000001 UNK16B4 */
+	xf_emit(ctx, 1, 0);		/* 00000003 MULTISAMPLE_CTRL */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK0F90 */
+	xf_emit(ctx, 1, 2);		/* 00000003 tesla UNK143C */
+	xf_emit(ctx, 2, 0x04000000);	/* 07ffffff tesla UNK0D6C */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 5);		/* 0000000f UNK1408 */
+	xf_emit(ctx, 1, 0x52);		/* 000001ff SEMANTIC_PTSZ */
+	xf_emit(ctx, 1, 0);		/* ffffffff POINT_SIZE */
+	xf_emit(ctx, 1, 0);		/* 00000001 */
+	xf_emit(ctx, 1, 0);		/* 00000007 tesla UNK0FB4 */
+	if (dev_priv->chipset != 0x50) {
+		xf_emit(ctx, 1, 0);	/* 3ff */
+		xf_emit(ctx, 1, 1);	/* 00000001 tesla UNK1110 */
 	}
-	xf_emit(ctx, 0x10, 0x3f800000);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 0x26, 0);
-	xf_emit(ctx, 1, 0x8100c12);
-	xf_emit(ctx, 1, 5);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 4, 0xffff);
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 0);	/* 00000003 tesla UNK1928 */
+	xf_emit(ctx, 0x10, 0);		/* ffffffff DEPTH_RANGE_NEAR */
+	xf_emit(ctx, 0x10, 0x3f800000);	/* ffffffff DEPTH_RANGE_FAR */
+	xf_emit(ctx, 1, 0x10);		/* 000000ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 0x20, 0);		/* 07ffffff VIEWPORT_HORIZ, then VIEWPORT_VERT. (W&0x3fff)<<13 | (X&0x1fff). */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK187C */
+	xf_emit(ctx, 1, 0);		/* 00000003 WINDOW_ORIGIN */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_MASK */
+	xf_emit(ctx, 1, 0x8100c12);	/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 5);		/* 0000000f tesla UNK1220 */
+	xf_emit(ctx, 1, 0);		/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 000000ff tesla UNK1A20 */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	xf_emit(ctx, 4, 0xffff);	/* 0000ffff MSAA_MASK */
 	if (dev_priv->chipset != 0x50)
-		xf_emit(ctx, 1, 3);
+		xf_emit(ctx, 1, 3);	/* 00000003 tesla UNK1100 */
 	if (dev_priv->chipset < 0xa0)
-		xf_emit(ctx, 0x1f, 0);
-	else if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0xc, 0);
-	else
-		xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x00ffff00);
-	xf_emit(ctx, 1, 0x1a);
+		xf_emit(ctx, 0x1c, 0);	/* RO */
+	else if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0x9, 0);
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0x00ffff00);	/* 00ffffff LINE_STIPPLE_PATTERN */
+	xf_emit(ctx, 1, 0x1a);		/* 0000001f POLYGON_MODE */
+	xf_emit(ctx, 1, 0);		/* 00000003 WINDOW_ORIGIN */
 	if (dev_priv->chipset != 0x50) {
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 1, 3);
+		xf_emit(ctx, 1, 3);	/* 00000003 tesla UNK1100 */
+		xf_emit(ctx, 1, 0);	/* 3ff */
 	}
+	/* XXX: the following block could belong either to unk1cxx, or
+	 * to STRMOUT. Rather hard to tell. */
 	if (dev_priv->chipset < 0xa0)
-		xf_emit(ctx, 0x26, 0);
+		xf_emit(ctx, 0x25, 0);
 	else
-		xf_emit(ctx, 0x3c, 0);
-	xf_emit(ctx, 1, 0x102);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 4, 4);
-	if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 0x3b, 0);
+}
+
+static void
+nv50_graph_construct_gene_strmout(struct nouveau_grctx *ctx)
+{
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	xf_emit(ctx, 1, 0x102);		/* 0000ffff STRMOUT_BUFFER_CTRL */
+	xf_emit(ctx, 1, 0);		/* ffffffff STRMOUT_PRIMITIVE_COUNT */
+	xf_emit(ctx, 4, 4);		/* 000000ff STRMOUT_NUM_ATTRIBS */
+	if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 4, 0);	/* ffffffff UNK1A8C */
+		xf_emit(ctx, 4, 0);	/* ffffffff UNK1780 */
+	}
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 4);		/* 0000007f VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
 	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0x3ff);
+		xf_emit(ctx, 1, 0x3ff);	/* 000003ff tesla UNK0D68 */
 	else
-		xf_emit(ctx, 1, 0x7ff);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x102);
-	xf_emit(ctx, 9, 0);
-	xf_emit(ctx, 4, 4);
-	xf_emit(ctx, 0x2c, 0);
+		xf_emit(ctx, 1, 0x7ff);	/* 000007ff tesla UNK0D68 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	/* SEEK */
+	xf_emit(ctx, 1, 0x102);		/* 0000ffff STRMOUT_BUFFER_CTRL */
+	xf_emit(ctx, 1, 0);		/* ffffffff STRMOUT_PRIMITIVE_COUNT */
+	xf_emit(ctx, 4, 0);		/* 000000ff STRMOUT_ADDRESS_HIGH */
+	xf_emit(ctx, 4, 0);		/* ffffffff STRMOUT_ADDRESS_LOW */
+	xf_emit(ctx, 4, 4);		/* 000000ff STRMOUT_NUM_ATTRIBS */
+	if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 4, 0);	/* ffffffff UNK1A8C */
+		xf_emit(ctx, 4, 0);	/* ffffffff UNK1780 */
+	}
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_STRMOUT */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_QUERY */
+	xf_emit(ctx, 1, 0);		/* 000000ff QUERY_ADDRESS_HIGH */
+	xf_emit(ctx, 2, 0);		/* ffffffff QUERY_ADDRESS_LOW QUERY_COUNTER */
+	xf_emit(ctx, 2, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	/* SEEK */
+	xf_emit(ctx, 0x20, 0);		/* ffffffff STRMOUT_MAP */
+	xf_emit(ctx, 1, 0);		/* 0000000f */
+	xf_emit(ctx, 1, 0);		/* 00000000? */
+	xf_emit(ctx, 2, 0);		/* ffffffff */
+}
+
+static void
+nv50_graph_construct_gene_ropm1(struct nouveau_grctx *ctx)
+{
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	xf_emit(ctx, 1, 0x4e3bfdf);	/* ffffffff UNK0D64 */
+	xf_emit(ctx, 1, 0x4e3bfdf);	/* ffffffff UNK0DF4 */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0);		/* 000003ff */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 0x11);	/* 000000ff tesla UNK1968 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+}
+
+static void
+nv50_graph_construct_gene_ropm2(struct nouveau_grctx *ctx)
+{
+	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_QUERY */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 2, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 000000ff QUERY_ADDRESS_HIGH */
+	xf_emit(ctx, 2, 0);		/* ffffffff QUERY_ADDRESS_LOW, COUNTER */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 7 */
+	/* SEEK */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_QUERY */
+	xf_emit(ctx, 1, 0);		/* 000000ff QUERY_ADDRESS_HIGH */
+	xf_emit(ctx, 2, 0);		/* ffffffff QUERY_ADDRESS_LOW, COUNTER */
+	xf_emit(ctx, 1, 0x4e3bfdf);	/* ffffffff UNK0D64 */
+	xf_emit(ctx, 1, 0x4e3bfdf);	/* ffffffff UNK0DF4 */
+	xf_emit(ctx, 1, 0);		/* 00000001 eng2d UNK260 */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 0x11);	/* 000000ff tesla UNK1968 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
 }
 
 static void
@@ -1749,443 +2392,709 @@ nv50_graph_construct_gene_ropc(struct nouveau_grctx *ctx)
 	int magic2;
 	if (dev_priv->chipset == 0x50) {
 		magic2 = 0x00003e60;
-	} else if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa) {
+	} else if (!IS_NVA3F(dev_priv->chipset)) {
 		magic2 = 0x001ffe67;
 	} else {
 		magic2 = 0x00087e67;
 	}
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, magic2);
-	xf_emit(ctx, 4, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 7, 0);
-	if (dev_priv->chipset >= 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 0x15);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 4, 0);
+	xf_emit(ctx, 1, 0);		/* f/7 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_BACK_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_BACK_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 2);		/* 00000003 tesla UNK143C */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, magic2);	/* 001fffff tesla UNK0F78 */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000007 DEPTH_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_FRONT_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_FRONT_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	if (dev_priv->chipset >= 0xa0 && !IS_NVAAF(dev_priv->chipset))
+		xf_emit(ctx, 1, 0x15);	/* 000000ff */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 0x10);		/* 3ff/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);		/* ffffffff CLEAR_DEPTH */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
 	if (dev_priv->chipset == 0x86 || dev_priv->chipset == 0x92 || dev_priv->chipset == 0x98 || dev_priv->chipset >= 0xa0) {
-		xf_emit(ctx, 1, 4);
-		xf_emit(ctx, 1, 0x400);
-		xf_emit(ctx, 1, 0x300);
-		xf_emit(ctx, 1, 0x1001);
+		xf_emit(ctx, 3, 0);	/* ff, ffffffff, ffffffff */
+		xf_emit(ctx, 1, 4);	/* 7 */
+		xf_emit(ctx, 1, 0x400);	/* fffffff */
+		xf_emit(ctx, 1, 0x300);	/* ffff */
+		xf_emit(ctx, 1, 0x1001);	/* 1fff */
 		if (dev_priv->chipset != 0xa0) {
-			if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-				xf_emit(ctx, 1, 0);
+			if (IS_NVA3F(dev_priv->chipset))
+				xf_emit(ctx, 1, 0);	/* 0000000f UNK15C8 */
 			else
-				xf_emit(ctx, 1, 0x15);
+				xf_emit(ctx, 1, 0x15);	/* ff */
 		}
-		xf_emit(ctx, 3, 0);
 	}
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x13, 0);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 0x10, 0);
-	xf_emit(ctx, 0x10, 0x3f800000);
-	xf_emit(ctx, 0x19, 0);
-	xf_emit(ctx, 1, 0x10);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x3f);
-	xf_emit(ctx, 6, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
+	xf_emit(ctx, 1, 0);		/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_BACK_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 2);		/* 00000003 tesla UNK143C */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000007 DEPTH_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_FRONT_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 0x10);		/* 7f/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1900 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_BACK_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_REF */
+	xf_emit(ctx, 2, 0);		/* ffffffff DEPTH_BOUNDS */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000007 DEPTH_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK0FB0 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_FRONT_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_REF */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 0x10);		/* 7f/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 0x10, 0);		/* ffffffff DEPTH_RANGE_NEAR */
+	xf_emit(ctx, 0x10, 0x3f800000);	/* ffffffff DEPTH_RANGE_FAR */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 0);		/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_BACK_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_FUNC_REF */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_BACK_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 2, 0);		/* ffffffff DEPTH_BOUNDS */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000007 DEPTH_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 000000ff CLEAR_STENCIL */
+	xf_emit(ctx, 1, 0);		/* 00000007 STENCIL_FRONT_FUNC_FUNC */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_FUNC_REF */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_FRONT_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 1, 0x10);		/* 7f/ff VIEW_VOLUME_CLIP_CTRL */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 0x3f);		/* 0000003f UNK1590 */
+	xf_emit(ctx, 1, 0);		/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 2, 0);		/* ffff0ff3, ffff */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK0FB0 */
+	xf_emit(ctx, 1, 0);		/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff CLEAR_DEPTH */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK19CC */
 	if (dev_priv->chipset >= 0xa0) {
 		xf_emit(ctx, 2, 0);
 		xf_emit(ctx, 1, 0x1001);
 		xf_emit(ctx, 0xb, 0);
 	} else {
-		xf_emit(ctx, 0xc, 0);
+		xf_emit(ctx, 1, 0);	/* 00000007 */
+		xf_emit(ctx, 1, 0);	/* 00000001 tesla UNK1534 */
+		xf_emit(ctx, 1, 0);	/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+		xf_emit(ctx, 8, 0);	/* 00000001 BLEND_ENABLE */
+		xf_emit(ctx, 1, 0);	/* ffff0ff3 */
 	}
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x11);
-	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 4, 0);
-	else
-		xf_emit(ctx, 6, 0);
-	xf_emit(ctx, 3, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, magic2);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 0x18, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 8, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 8, 1);
-		xf_emit(ctx, 3, 0);
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 5, 0);
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 0x16, 0);
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f */
+	xf_emit(ctx, 1, 0);		/* 00000001 LOGIC_OP_ENABLE */
+	if (dev_priv->chipset != 0x50) {
+		xf_emit(ctx, 1, 0);	/* 0000000f LOGIC_OP */
+		xf_emit(ctx, 1, 0);	/* 000000ff */
+	}
+	xf_emit(ctx, 1, 0);		/* 00000007 OPERATION */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 0);		/* 00000003 UNK0F90 */
+	xf_emit(ctx, 2, 1);		/* 00000007 BLEND_EQUATION_RGB, ALPHA */
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK133C */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_RGB */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_RGB */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_ALPHA */
+	xf_emit(ctx, 1, 0);		/* 00000001 */
+	xf_emit(ctx, 1, magic2);	/* 001fffff tesla UNK0F78 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 0);	/* 00000001 tesla UNK12E4 */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_RGB */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_ALPHA */
+		xf_emit(ctx, 8, 1);	/* 00000001 IBLEND_UNK00 */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_FUNC_SRC_RGB */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_FUNC_DST_RGB */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_FUNC_SRC_ALPHA */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_FUNC_DST_ALPHA */
+		xf_emit(ctx, 1, 0);	/* 00000001 tesla UNK1140 */
+		xf_emit(ctx, 2, 0);	/* 00000001 */
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+		xf_emit(ctx, 1, 0);	/* 0000000f */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* ffffffff */
+		xf_emit(ctx, 2, 0);	/* 00000001 */
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+		xf_emit(ctx, 1, 0);	/* 00000001 */
+		xf_emit(ctx, 1, 0);	/* 000003ff */
+	} else if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 2, 0);	/* 00000001 */
+		xf_emit(ctx, 1, 0);	/* 00000007 */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* ffffffff */
+		xf_emit(ctx, 2, 0);	/* 00000001 */
 	} else {
-		if (dev_priv->chipset >= 0xa0)
-			xf_emit(ctx, 0x1b, 0);
-		else
-			xf_emit(ctx, 0x15, 0);
+		xf_emit(ctx, 1, 0);	/* 00000007 MULTISAMPLE_SAMPLES_LOG2 */
+		xf_emit(ctx, 1, 0);	/* 00000003 tesla UNK1430 */
+		xf_emit(ctx, 1, 0);	/* ffffffff tesla UNK1A3C */
 	}
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 1);
+	xf_emit(ctx, 4, 0);		/* ffffffff CLEAR_COLOR */
+	xf_emit(ctx, 4, 0);		/* ffffffff BLEND_COLOR A R G B */
+	xf_emit(ctx, 1, 0);		/* 00000fff eng2d UNK2B0 */
 	if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 4, 0);
-	else
-		xf_emit(ctx, 3, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 0x10, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 0x10, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 8, 1);
-		xf_emit(ctx, 3, 0);
+		xf_emit(ctx, 2, 0);	/* 00000001 */
+	xf_emit(ctx, 1, 0);		/* 000003ff */
+	xf_emit(ctx, 8, 0);		/* 00000001 BLEND_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK133C */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_RGB */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_RGB */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_RGB */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_ALPHA */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK19C0 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LOGIC_OP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f LOGIC_OP */
+	if (dev_priv->chipset >= 0xa0)
+		xf_emit(ctx, 1, 0);	/* 00000001 UNK12E4? NVA3+ only? */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 8, 1);	/* 00000001 IBLEND_UNK00 */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_RGB */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_FUNC_SRC_RGB */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_FUNC_DST_RGB */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_ALPHA */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_FUNC_SRC_ALPHA */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_FUNC_DST_ALPHA */
+		xf_emit(ctx, 1, 0);	/* 00000001 tesla UNK15C4 */
+		xf_emit(ctx, 1, 0);	/* 00000001 */
+		xf_emit(ctx, 1, 0);	/* 00000001 tesla UNK1140 */
 	}
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x5b, 0);
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	xf_emit(ctx, 1, 0);		/* 00000007 PATTERN_COLOR_FORMAT */
+	xf_emit(ctx, 2, 0);		/* ffffffff PATTERN_MONO_COLOR */
+	xf_emit(ctx, 1, 0);		/* 00000001 PATTERN_MONO_FORMAT */
+	xf_emit(ctx, 2, 0);		/* ffffffff PATTERN_MONO_BITMAP */
+	xf_emit(ctx, 1, 0);		/* 00000003 PATTERN_SELECT */
+	xf_emit(ctx, 1, 0);		/* 000000ff ROP */
+	xf_emit(ctx, 1, 0);		/* ffffffff BETA1 */
+	xf_emit(ctx, 1, 0);		/* ffffffff BETA4 */
+	xf_emit(ctx, 1, 0);		/* 00000007 OPERATION */
+	xf_emit(ctx, 0x50, 0);		/* 10x ffffff, ffffff, ffffff, ffffff, 3 PATTERN */
 }
 
 static void
-nv50_graph_construct_xfer_tp_x1(struct nouveau_grctx *ctx)
+nv50_graph_construct_xfer_unk84xx(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
 	int magic3;
-	if (dev_priv->chipset == 0x50)
+	switch (dev_priv->chipset) {
+	case 0x50:
 		magic3 = 0x1000;
-	else if (dev_priv->chipset == 0x86 || dev_priv->chipset == 0x98 || dev_priv->chipset >= 0xa8)
+		break;
+	case 0x86:
+	case 0x98:
+	case 0xa8:
+	case 0xaa:
+	case 0xac:
+	case 0xaf:
 		magic3 = 0x1e00;
-	else
+		break;
+	default:
 		magic3 = 0;
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 4);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0x24, 0);
+	}
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 7f/ff[NVA0+] VP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 111/113[NVA0+] */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0x1f, 0);	/* ffffffff */
 	else if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 0x14, 0);
+		xf_emit(ctx, 0x0f, 0);	/* ffffffff */
 	else
-		xf_emit(ctx, 0x15, 0);
-	xf_emit(ctx, 2, 4);
+		xf_emit(ctx, 0x10, 0);	/* fffffff VP_RESULT_MAP_1 up */
+	xf_emit(ctx, 2, 0);		/* f/1f[NVA3], fffffff/ffffffff[NVA0+] */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_RESULT_MAP_SIZE */
 	if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 1, 0x03020100);
+		xf_emit(ctx, 1, 0x03020100);	/* ffffffff */
 	else
-		xf_emit(ctx, 1, 0x00608080);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 2, 4);
-	xf_emit(ctx, 1, 0x80);
+		xf_emit(ctx, 1, 0x00608080);	/* fffffff VP_RESULT_MAP_0 */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 2, 0);		/* 111/113, 7f/ff */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0x80);		/* 0000ffff GP_VERTEX_OUTPUT_COUNT */
 	if (magic3)
-		xf_emit(ctx, 1, magic3);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 0x24, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0x80);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0x03020100);
-	xf_emit(ctx, 1, 3);
+		xf_emit(ctx, 1, magic3);	/* 00007fff tesla UNK141C */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 111/113 */
+	xf_emit(ctx, 0x1f, 0);		/* ffffffff GP_RESULT_MAP_1 up */
+	xf_emit(ctx, 1, 0);		/* 0000001f */
+	xf_emit(ctx, 1, 0);		/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 0x80);		/* 0000ffff GP_VERTEX_OUTPUT_COUNT */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0x03020100);	/* ffffffff GP_RESULT_MAP_0 */
+	xf_emit(ctx, 1, 3);		/* 00000003 GP_OUTPUT_PRIMITIVE_TYPE */
 	if (magic3)
-		xf_emit(ctx, 1, magic3);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 4);
+		xf_emit(ctx, 1, magic3);	/* 7fff tesla UNK141C */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 0);		/* 00000001 PROVOKING_VERTEX_LAST */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 111/113 */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 3);		/* 00000003 GP_OUTPUT_PRIMITIVE_TYPE */
+	xf_emit(ctx, 1, 0);		/* 00000001 PROVOKING_VERTEX_LAST */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK13A0 */
+	xf_emit(ctx, 1, 4);		/* 7f/ff VP_REG_ALLOC_RESULT */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+	xf_emit(ctx, 1, 0);		/* 111/113 */
 	if (dev_priv->chipset == 0x94 || dev_priv->chipset == 0x96)
-		xf_emit(ctx, 0x1024, 0);
+		xf_emit(ctx, 0x1020, 0);	/* 4 x (0x400 x 0xffffffff, ff, 0, 0, 0, 4 x ffffffff) */
 	else if (dev_priv->chipset < 0xa0)
-		xf_emit(ctx, 0xa24, 0);
-	else if (dev_priv->chipset == 0xa0 || dev_priv->chipset >= 0xaa)
-		xf_emit(ctx, 0x214, 0);
+		xf_emit(ctx, 0xa20, 0);	/* 4 x (0x280 x 0xffffffff, ff, 0, 0, 0, 4 x ffffffff) */
+	else if (!IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 0x210, 0);	/* ffffffff */
 	else
-		xf_emit(ctx, 0x414, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 2, 0);
+		xf_emit(ctx, 0x410, 0);	/* ffffffff */
+	xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+	xf_emit(ctx, 1, 4);		/* 000000ff GP_RESULT_MAP_SIZE */
+	xf_emit(ctx, 1, 3);		/* 00000003 GP_OUTPUT_PRIMITIVE_TYPE */
+	xf_emit(ctx, 1, 0);		/* 00000001 PROVOKING_VERTEX_LAST */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
 }
 
 static void
-nv50_graph_construct_xfer_tp_x2(struct nouveau_grctx *ctx)
+nv50_graph_construct_xfer_tprop(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
 	int magic1, magic2;
 	if (dev_priv->chipset == 0x50) {
 		magic1 = 0x3ff;
 		magic2 = 0x00003e60;
-	} else if (dev_priv->chipset <= 0xa0 || dev_priv->chipset >= 0xaa) {
+	} else if (!IS_NVA3F(dev_priv->chipset)) {
 		magic1 = 0x7ff;
 		magic2 = 0x001ffe67;
 	} else {
 		magic1 = 0x7ff;
 		magic2 = 0x00087e67;
 	}
-	xf_emit(ctx, 3, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0xc, 0);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 0xb, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 4, 0xffff);
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 5, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 1, 3);
-		xf_emit(ctx, 1, 0);
-	} else if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0xa, 0);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 0x18, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 8, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 8, 1);
-		xf_emit(ctx, 1, 0);
+	xf_emit(ctx, 1, 0);		/* 00000007 ALPHA_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* ffffffff ALPHA_TEST_REF */
+	xf_emit(ctx, 1, 0);		/* 00000001 ALPHA_TEST_ENABLE */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000000f UNK16A0 */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_BACK_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_BACK_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 4, 0);		/* ffffffff BLEND_COLOR */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK19C0 */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK0FDC */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 LOGIC_OP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ff[NV50]/3ff[NV84+] */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 4, 0xffff);	/* 0000ffff MSAA_MASK */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 3, 0);		/* 00000007 STENCIL_FRONT_OP_FAIL, ZFAIL, ZPASS */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_BACK_ENABLE */
+	xf_emit(ctx, 2, 0);		/* 00007fff WINDOW_OFFSET_XY */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK19CC */
+	xf_emit(ctx, 1, 0);		/* 7 */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff COLOR_KEY */
+	xf_emit(ctx, 1, 0);		/* 00000001 COLOR_KEY_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 COLOR_KEY_FORMAT */
+	xf_emit(ctx, 2, 0);		/* ffffffff SIFC_BITMAP_COLOR */
+	xf_emit(ctx, 1, 1);		/* 00000001 SIFC_BITMAP_WRITE_BIT0_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 ALPHA_TEST_FUNC */
+	xf_emit(ctx, 1, 0);		/* 00000001 ALPHA_TEST_ENABLE */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 3);	/* 00000003 tesla UNK16B4 */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* 00000003 tesla UNK1298 */
+	} else if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 1, 1);	/* 00000001 tesla UNK16B4 */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+	} else {
+		xf_emit(ctx, 1, 0);	/* 00000003 MULTISAMPLE_CTRL */
 	}
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 3, 0xcf);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0xa, 0);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 8, 1);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, magic2);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x11);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 2, 1);
-	else
-		xf_emit(ctx, 1, 1);
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 8, 0);		/* 00000001 BLEND_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_ALPHA */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_RGB */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_RGB */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_RGB */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 0);	/* 00000001 UNK12E4 */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_RGB */
+		xf_emit(ctx, 8, 1);	/* 00000007 IBLEND_EQUATION_ALPHA */
+		xf_emit(ctx, 8, 1);	/* 00000001 IBLEND_UNK00 */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_SRC_RGB */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_DST_RGB */
+		xf_emit(ctx, 8, 2);	/* 0000001f IBLEND_SRC_ALPHA */
+		xf_emit(ctx, 8, 1);	/* 0000001f IBLEND_DST_ALPHA */
+		xf_emit(ctx, 1, 0);	/* 00000001 UNK1140 */
+	}
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK133C */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000001 LOGIC_OP_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000003 UNK0F90 */
+	xf_emit(ctx, 1, 0);		/* 00000001 FRAMEBUFFER_SRGB */
+	xf_emit(ctx, 1, 0);		/* 7 */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	xf_emit(ctx, 1, 0);		/* 00000007 OPERATION */
+	xf_emit(ctx, 1, 0xcf);		/* 000000ff SIFC_FORMAT */
+	xf_emit(ctx, 1, 0xcf);		/* 000000ff DRAW_COLOR_FORMAT */
+	xf_emit(ctx, 1, 0xcf);		/* 000000ff SRC_FORMAT */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	xf_emit(ctx, 1, 0);		/* 7/f[NVA3] MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 8, 0);		/* 00000001 BLEND_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_ALPHA */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_ALPHA */
+	xf_emit(ctx, 1, 1);		/* 0000001f BLEND_FUNC_DST_RGB */
+	xf_emit(ctx, 1, 1);		/* 00000007 BLEND_EQUATION_RGB */
+	xf_emit(ctx, 1, 2);		/* 0000001f BLEND_FUNC_SRC_RGB */
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK133C */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 8, 1);		/* 00000001 UNK19E0 */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, magic2);	/* 001fffff tesla UNK0F78 */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
 	if(dev_priv->chipset == 0x50)
-		xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 1, 0);	/* ff */
 	else
-		xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 5, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, magic1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 0x28, 0);
-	xf_emit(ctx, 8, 8);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 8, 0x400);
-	xf_emit(ctx, 8, 0x300);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x20);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 0x100);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x40);
-	xf_emit(ctx, 1, 0x100);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 4, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, magic2);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 9, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x400);
-	xf_emit(ctx, 1, 0x300);
-	xf_emit(ctx, 1, 0x1001);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 4, 0);
-	else
-		xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 1, 0xf);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 0x15, 0);
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 3, 0);
-	} else
-		xf_emit(ctx, 0x17, 0);
+		xf_emit(ctx, 3, 0);	/* 1, 7, 3ff */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000003 UNK0F90 */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_DU_DX_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff BLIT_DU_DX_INT */
+	xf_emit(ctx, 1, 0);		/* 000fffff BLIT_DV_DY_FRACT */
+	xf_emit(ctx, 1, 1);		/* 0001ffff BLIT_DV_DY_INT */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, magic1);	/* 3ff/7ff tesla UNK0D68 */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 8, 0);		/* 0000ffff DMA_COLOR */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_GLOBAL */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_LOCAL */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_STACK */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_DST */
+	xf_emit(ctx, 1, 0);		/* 7 */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 8, 0);		/* 000000ff RT_ADDRESS_HIGH */
+	xf_emit(ctx, 8, 0);		/* ffffffff RT_LAYER_STRIDE */
+	xf_emit(ctx, 8, 0);		/* ffffffff RT_ADDRESS_LOW */
+	xf_emit(ctx, 8, 8);		/* 0000007f RT_TILE_MODE */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 8, 0x400);		/* 0fffffff RT_HORIZ */
+	xf_emit(ctx, 8, 0x300);		/* 0000ffff RT_VERT */
+	xf_emit(ctx, 1, 1);		/* 00001fff RT_ARRAY_MODE */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, 0x20);		/* 00000fff DST_TILE_MODE */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 0x100);		/* 0001ffff DST_HEIGHT */
+	xf_emit(ctx, 1, 0);		/* 000007ff DST_LAYER */
+	xf_emit(ctx, 1, 1);		/* 00000001 DST_LINEAR */
+	xf_emit(ctx, 1, 0);		/* ffffffff DST_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0);		/* 000000ff DST_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0x40);		/* 0007ffff DST_PITCH */
+	xf_emit(ctx, 1, 0x100);		/* 0001ffff DST_WIDTH */
+	xf_emit(ctx, 1, 0);		/* 0000ffff */
+	xf_emit(ctx, 1, 3);		/* 00000003 tesla UNK15AC */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 0);		/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0);		/* 00000003 UNK0F90 */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 1, magic2);	/* 001fffff tesla UNK0F78 */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 2);		/* 00000003 tesla UNK143C */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_ZETA */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 2, 0);		/* ffff, ff/3ff */
+	xf_emit(ctx, 1, 0);		/* 0001ffff GP_BUILTIN_RESULT_EN */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 000000ff STENCIL_FRONT_MASK */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0);		/* ffffffff ZETA_LAYER_STRIDE */
+	xf_emit(ctx, 1, 0);		/* 000000ff ZETA_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);		/* ffffffff ZETA_ADDRESS_LOW */
+	xf_emit(ctx, 1, 4);		/* 00000007 ZETA_TILE_MODE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	xf_emit(ctx, 1, 0x400);		/* 0fffffff ZETA_HORIZ */
+	xf_emit(ctx, 1, 0x300);		/* 0000ffff ZETA_VERT */
+	xf_emit(ctx, 1, 0x1001);	/* 00001fff ZETA_ARRAY_MODE */
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 0);	/* 00000001 */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 7, 0);		/* 3f/7f RT_FORMAT */
+	xf_emit(ctx, 1, 0x0fac6881);	/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0xf);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 7, 0);		/* 0000000f COLOR_MASK */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 8, 0);		/* 00000001 BLEND_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000003 UNK0F90 */
+	xf_emit(ctx, 1, 0);		/* 00000001 FRAMEBUFFER_SRGB */
+	xf_emit(ctx, 1, 0);		/* 7 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LOGIC_OP_ENABLE */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 0);	/* 00000001 UNK1140 */
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	}
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK1534 */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
 	if (dev_priv->chipset >= 0xa0)
-		xf_emit(ctx, 1, 0x0fac6881);
-	xf_emit(ctx, 1, magic2);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 3, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 2, 1);
-	else
-		xf_emit(ctx, 1, 1);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 2, 0);
-	else if (dev_priv->chipset != 0x50)
-		xf_emit(ctx, 1, 0);
+		xf_emit(ctx, 1, 0x0fac6881);	/* fffffff */
+	xf_emit(ctx, 1, magic2);	/* 001fffff tesla UNK0F78 */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_BOUNDS_EN */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0x11);		/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK0FB0 */
+	xf_emit(ctx, 1, 0);		/* ff/3ff */
+	xf_emit(ctx, 1, 4);		/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);		/* 00000001 STENCIL_FRONT_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK15B4 */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK19CC */
+	xf_emit(ctx, 1, 0);		/* 00000007 */
+	xf_emit(ctx, 1, 0);		/* 00000001 SAMPLECNT_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 0000000f ZETA_FORMAT */
+	xf_emit(ctx, 1, 1);		/* 00000001 ZETA_ENABLE */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+		xf_emit(ctx, 1, 0);	/* 0000000f tesla UNK15C8 */
+	}
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A3C */
+	if (dev_priv->chipset >= 0xa0) {
+		xf_emit(ctx, 3, 0);		/* 7/f, 1, ffff0ff3 */
+		xf_emit(ctx, 1, 0xfac6881);	/* fffffff */
+		xf_emit(ctx, 4, 0);		/* 1, 1, 1, 3ff */
+		xf_emit(ctx, 1, 4);		/* 7 */
+		xf_emit(ctx, 1, 0);		/* 1 */
+		xf_emit(ctx, 2, 1);		/* 1 */
+		xf_emit(ctx, 2, 0);		/* 7, f */
+		xf_emit(ctx, 1, 1);		/* 1 */
+		xf_emit(ctx, 1, 0);		/* 7/f */
+		if (IS_NVA3F(dev_priv->chipset))
+			xf_emit(ctx, 0x9, 0);	/* 1 */
+		else
+			xf_emit(ctx, 0x8, 0);	/* 1 */
+		xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+		xf_emit(ctx, 8, 1);		/* 1 */
+		xf_emit(ctx, 1, 0x11);		/* 7f */
+		xf_emit(ctx, 7, 0);		/* 7f */
+		xf_emit(ctx, 1, 0xfac6881);	/* fffffff */
+		xf_emit(ctx, 1, 0xf);		/* f */
+		xf_emit(ctx, 7, 0);		/* f */
+		xf_emit(ctx, 1, 0x11);		/* 7f */
+		xf_emit(ctx, 1, 1);		/* 1 */
+		xf_emit(ctx, 5, 0);		/* 1, 7, 3ff, 3, 7 */
+		if (IS_NVA3F(dev_priv->chipset)) {
+			xf_emit(ctx, 1, 0);	/* 00000001 UNK1140 */
+			xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+		}
+	}
 }
 
 static void
-nv50_graph_construct_xfer_tp_x3(struct nouveau_grctx *ctx)
+nv50_graph_construct_xfer_tex(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
+	xf_emit(ctx, 2, 0);		/* 1 LINKED_TSC. yes, 2. */
+	if (dev_priv->chipset != 0x50)
+		xf_emit(ctx, 1, 0);	/* 3 */
+	xf_emit(ctx, 1, 1);		/* 1ffff BLIT_DU_DX_INT */
+	xf_emit(ctx, 1, 0);		/* fffff BLIT_DU_DX_FRACT */
+	xf_emit(ctx, 1, 1);		/* 1ffff BLIT_DV_DY_INT */
+	xf_emit(ctx, 1, 0);		/* fffff BLIT_DV_DY_FRACT */
 	if (dev_priv->chipset == 0x50)
-		xf_emit(ctx, 2, 0);
+		xf_emit(ctx, 1, 0);	/* 3 BLIT_CONTROL */
 	else
-		xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0x2a712488);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x4085c000);
-	xf_emit(ctx, 1, 0x40);
-	xf_emit(ctx, 1, 0x100);
-	xf_emit(ctx, 1, 0x10100);
-	xf_emit(ctx, 1, 0x02800000);
+		xf_emit(ctx, 2, 0);	/* 3ff, 1 */
+	xf_emit(ctx, 1, 0x2a712488);	/* ffffffff SRC_TIC_0 */
+	xf_emit(ctx, 1, 0);		/* ffffffff SRC_TIC_1 */
+	xf_emit(ctx, 1, 0x4085c000);	/* ffffffff SRC_TIC_2 */
+	xf_emit(ctx, 1, 0x40);		/* ffffffff SRC_TIC_3 */
+	xf_emit(ctx, 1, 0x100);		/* ffffffff SRC_TIC_4 */
+	xf_emit(ctx, 1, 0x10100);	/* ffffffff SRC_TIC_5 */
+	xf_emit(ctx, 1, 0x02800000);	/* ffffffff SRC_TIC_6 */
+	xf_emit(ctx, 1, 0);		/* ffffffff SRC_TIC_7 */
+	if (dev_priv->chipset == 0x50) {
+		xf_emit(ctx, 1, 0);	/* 00000001 turing UNK358 */
+		xf_emit(ctx, 1, 0);	/* ffffffff tesla UNK1A34? */
+		xf_emit(ctx, 1, 0);	/* 00000003 turing UNK37C tesla UNK1690 */
+		xf_emit(ctx, 1, 0);	/* 00000003 BLIT_CONTROL */
+		xf_emit(ctx, 1, 0);	/* 00000001 turing UNK32C tesla UNK0F94 */
+	} else if (!IS_NVAAF(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 0);	/* ffffffff tesla UNK1A34? */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* 000003ff */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* 000003ff */
+		xf_emit(ctx, 1, 0);	/* 00000003 tesla UNK1664 / turing UNK03E8 */
+		xf_emit(ctx, 1, 0);	/* 00000003 */
+		xf_emit(ctx, 1, 0);	/* 000003ff */
+	} else {
+		xf_emit(ctx, 0x6, 0);
+	}
+	xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A34 */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_TEXTURE */
+	xf_emit(ctx, 1, 0);		/* 0000ffff DMA_SRC */
 }
 
 static void
-nv50_graph_construct_xfer_tp_x4(struct nouveau_grctx *ctx)
+nv50_graph_construct_xfer_unk8cxx(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	xf_emit(ctx, 2, 0x04e3bfdf);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x00ffff00);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 2, 1);
-	else
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 0x00ffff00);
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0x30201000);
-	xf_emit(ctx, 1, 0x70605040);
-	xf_emit(ctx, 1, 0xb8a89888);
-	xf_emit(ctx, 1, 0xf8e8d8c8);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x1a);
-}
-
-static void
-nv50_graph_construct_xfer_tp_x5(struct nouveau_grctx *ctx)
-{
-	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 0xfac6881);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 2, 0);
-	xf_emit(ctx, 1, 1);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 0xb, 0);
-	else
-		xf_emit(ctx, 0xa, 0);
-	xf_emit(ctx, 8, 1);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0xfac6881);
-	xf_emit(ctx, 1, 0xf);
-	xf_emit(ctx, 7, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 1);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 6, 0);
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 6, 0);
-	} else {
-		xf_emit(ctx, 0xb, 0);
-	}
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 2, 0);		/* 7, ffff0ff3 */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 0x04e3bfdf);	/* ffffffff UNK0D64 */
+	xf_emit(ctx, 1, 0x04e3bfdf);	/* ffffffff UNK0DF4 */
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK15B4 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0x00ffff00);	/* 00ffffff LINE_STIPPLE_PATTERN */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK0F98 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);	/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK1668 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_STIPPLE_ENABLE */
+	xf_emit(ctx, 1, 0x00ffff00);	/* 00ffffff LINE_STIPPLE_PATTERN */
+	xf_emit(ctx, 1, 0);		/* 00000001 POLYGON_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 UNK1534 */
+	xf_emit(ctx, 1, 0);		/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 1, 0);		/* 00000001 tesla UNK1658 */
+	xf_emit(ctx, 1, 0);		/* 00000001 LINE_SMOOTH_ENABLE */
+	xf_emit(ctx, 1, 0);		/* ffff0ff3 */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);		/* 00000001 DEPTH_WRITE_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 UNK15B4 */
+	xf_emit(ctx, 1, 0);		/* 00000001 POINT_SPRITE_ENABLE */
+	xf_emit(ctx, 1, 1);		/* 00000001 tesla UNK165C */
+	xf_emit(ctx, 1, 0x30201000);	/* ffffffff tesla UNK1670 */
+	xf_emit(ctx, 1, 0x70605040);	/* ffffffff tesla UNK1670 */
+	xf_emit(ctx, 1, 0xb8a89888);	/* ffffffff tesla UNK1670 */
+	xf_emit(ctx, 1, 0xf8e8d8c8);	/* ffffffff tesla UNK1670 */
+	xf_emit(ctx, 1, 0);		/* 00000001 VERTEX_TWO_SIDE_ENABLE */
+	xf_emit(ctx, 1, 0x1a);		/* 0000001f POLYGON_MODE */
 }
 
 static void
@@ -2193,108 +3102,136 @@ nv50_graph_construct_xfer_tp(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
 	if (dev_priv->chipset < 0xa0) {
-		nv50_graph_construct_xfer_tp_x1(ctx);
-		nv50_graph_construct_xfer_tp_x2(ctx);
-		nv50_graph_construct_xfer_tp_x3(ctx);
-		if (dev_priv->chipset == 0x50)
-			xf_emit(ctx, 0xf, 0);
-		else
-			xf_emit(ctx, 0x12, 0);
-		nv50_graph_construct_xfer_tp_x4(ctx);
+		nv50_graph_construct_xfer_unk84xx(ctx);
+		nv50_graph_construct_xfer_tprop(ctx);
+		nv50_graph_construct_xfer_tex(ctx);
+		nv50_graph_construct_xfer_unk8cxx(ctx);
 	} else {
-		nv50_graph_construct_xfer_tp_x3(ctx);
-		if (dev_priv->chipset < 0xaa)
-			xf_emit(ctx, 0xc, 0);
-		else
-			xf_emit(ctx, 0xa, 0);
-		nv50_graph_construct_xfer_tp_x2(ctx);
-		nv50_graph_construct_xfer_tp_x5(ctx);
-		nv50_graph_construct_xfer_tp_x4(ctx);
-		nv50_graph_construct_xfer_tp_x1(ctx);
+		nv50_graph_construct_xfer_tex(ctx);
+		nv50_graph_construct_xfer_tprop(ctx);
+		nv50_graph_construct_xfer_unk8cxx(ctx);
+		nv50_graph_construct_xfer_unk84xx(ctx);
 	}
 }
 
 static void
-nv50_graph_construct_xfer_tp2(struct nouveau_grctx *ctx)
+nv50_graph_construct_xfer_mpc(struct nouveau_grctx *ctx)
 {
 	struct drm_nouveau_private *dev_priv = ctx->dev->dev_private;
-	int i, mpcnt;
-	if (dev_priv->chipset == 0x98 || dev_priv->chipset == 0xaa)
-		mpcnt = 1;
-	else if (dev_priv->chipset < 0xa0 || dev_priv->chipset >= 0xa8)
-		mpcnt = 2;
-	else
-		mpcnt = 3;
+	int i, mpcnt = 2;
+	switch (dev_priv->chipset) {
+	case 0x98:
+	case 0xaa:
+		mpcnt = 1;
+		break;
+	case 0x50:
+	case 0x84:
+	case 0x86:
+	case 0x92:
+	case 0x94:
+	case 0x96:
+	case 0xa8:
+	case 0xac:
+		mpcnt = 2;
+		break;
+	case 0xa0:
+	case 0xa3:
+	case 0xa5:
+	case 0xaf:
+		mpcnt = 3;
+		break;
+	}
 	for (i = 0; i < mpcnt; i++) {
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 1, 0x80);
-		xf_emit(ctx, 1, 0x80007004);
-		xf_emit(ctx, 1, 0x04000400);
+		xf_emit(ctx, 1, 0);		/* ff */
+		xf_emit(ctx, 1, 0x80);		/* ffffffff tesla UNK1404 */
+		xf_emit(ctx, 1, 0x80007004);	/* ffffffff tesla UNK12B0 */
+		xf_emit(ctx, 1, 0x04000400);	/* ffffffff */
 		if (dev_priv->chipset >= 0xa0)
-			xf_emit(ctx, 1, 0xc0);
-		xf_emit(ctx, 1, 0x1000);
-		xf_emit(ctx, 2, 0);
-		if (dev_priv->chipset == 0x86 || dev_priv->chipset == 0x98 || dev_priv->chipset >= 0xa8) {
-			xf_emit(ctx, 1, 0xe00);
-			xf_emit(ctx, 1, 0x1e00);
+			xf_emit(ctx, 1, 0xc0);	/* 00007fff tesla UNK152C */
+		xf_emit(ctx, 1, 0x1000);	/* 0000ffff tesla UNK0D60 */
+		xf_emit(ctx, 1, 0);		/* ff/3ff */
+		xf_emit(ctx, 1, 0);		/* ffffffff tesla UNK1A30 */
+		if (dev_priv->chipset == 0x86 || dev_priv->chipset == 0x98 || dev_priv->chipset == 0xa8 || IS_NVAAF(dev_priv->chipset)) {
+			xf_emit(ctx, 1, 0xe00);		/* 7fff */
+			xf_emit(ctx, 1, 0x1e00);	/* 7fff */
 		}
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 2, 0);
+		xf_emit(ctx, 1, 1);		/* 000000ff VP_REG_ALLOC_TEMP */
+		xf_emit(ctx, 1, 0);		/* 00000001 LINKED_TSC */
+		xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
 		if (dev_priv->chipset == 0x50)
-			xf_emit(ctx, 2, 0x1000);
-		xf_emit(ctx, 1, 1);
-		xf_emit(ctx, 1, 0);
-		xf_emit(ctx, 1, 4);
-		xf_emit(ctx, 1, 2);
-		if (dev_priv->chipset >= 0xaa)
-			xf_emit(ctx, 0xb, 0);
+			xf_emit(ctx, 2, 0x1000);	/* 7fff tesla UNK141C */
+		xf_emit(ctx, 1, 1);		/* 000000ff GP_REG_ALLOC_TEMP */
+		xf_emit(ctx, 1, 0);		/* 00000001 GP_ENABLE */
+		xf_emit(ctx, 1, 4);		/* 000000ff FP_REG_ALLOC_TEMP */
+		xf_emit(ctx, 1, 2);		/* 00000003 REG_MODE */
+		if (IS_NVAAF(dev_priv->chipset))
+			xf_emit(ctx, 0xb, 0);	/* RO */
 		else if (dev_priv->chipset >= 0xa0)
-			xf_emit(ctx, 0xc, 0);
+			xf_emit(ctx, 0xc, 0);	/* RO */
 		else
-			xf_emit(ctx, 0xa, 0);
+			xf_emit(ctx, 0xa, 0);	/* RO */
 	}
-	xf_emit(ctx, 1, 0x08100c12);
-	xf_emit(ctx, 1, 0);
+	xf_emit(ctx, 1, 0x08100c12);		/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 0);			/* ff/3ff */
 	if (dev_priv->chipset >= 0xa0) {
-		xf_emit(ctx, 1, 0x1fe21);
+		xf_emit(ctx, 1, 0x1fe21);	/* 0003ffff tesla UNK0FAC */
 	}
-	xf_emit(ctx, 5, 0);
-	xf_emit(ctx, 4, 0xffff);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 2, 0x10001);
-	xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 0x1fe21);
-	xf_emit(ctx, 1, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 1);
-	xf_emit(ctx, 4, 0);
-	xf_emit(ctx, 1, 0x08100c12);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 8, 0);
-	xf_emit(ctx, 1, 0xfac6881);
-	xf_emit(ctx, 1, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa)
-		xf_emit(ctx, 1, 3);
-	xf_emit(ctx, 3, 0);
-	xf_emit(ctx, 1, 4);
-	xf_emit(ctx, 9, 0);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 2, 1);
-	xf_emit(ctx, 1, 2);
-	xf_emit(ctx, 3, 1);
-	xf_emit(ctx, 1, 0);
-	if (dev_priv->chipset > 0xa0 && dev_priv->chipset < 0xaa) {
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 0x10, 1);
-		xf_emit(ctx, 8, 2);
-		xf_emit(ctx, 0x18, 1);
-		xf_emit(ctx, 3, 0);
+	xf_emit(ctx, 3, 0);			/* 7fff, 0, 0 */
+	xf_emit(ctx, 1, 0);			/* 00000001 tesla UNK1534 */
+	xf_emit(ctx, 1, 0);			/* 7/f MULTISAMPLE_SAMPLES_LOG2 */
+	xf_emit(ctx, 4, 0xffff);		/* 0000ffff MSAA_MASK */
+	xf_emit(ctx, 1, 1);			/* 00000001 LANES32 */
+	xf_emit(ctx, 1, 0x10001);		/* 00ffffff BLOCK_ALLOC */
+	xf_emit(ctx, 1, 0x10001);		/* ffffffff BLOCKDIM_XY */
+	xf_emit(ctx, 1, 1);			/* 0000ffff BLOCKDIM_Z */
+	xf_emit(ctx, 1, 0);			/* ffffffff SHARED_SIZE */
+	xf_emit(ctx, 1, 0x1fe21);		/* 1ffff/3ffff[NVA0+] tesla UNK0FAC */
+	xf_emit(ctx, 1, 0);			/* ffffffff tesla UNK1A34 */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 1);		/* 0000001f tesla UNK169C */
+	xf_emit(ctx, 1, 0);			/* ff/3ff */
+	xf_emit(ctx, 1, 0);			/* 1 LINKED_TSC */
+	xf_emit(ctx, 1, 0);			/* ff FP_ADDRESS_HIGH */
+	xf_emit(ctx, 1, 0);			/* ffffffff FP_ADDRESS_LOW */
+	xf_emit(ctx, 1, 0x08100c12);		/* 1fffffff FP_INTERPOLANT_CTRL */
+	xf_emit(ctx, 1, 4);			/* 00000007 FP_CONTROL */
+	xf_emit(ctx, 1, 0);			/* 000000ff FRAG_COLOR_CLAMP_EN */
+	xf_emit(ctx, 1, 2);			/* 00000003 REG_MODE */
+	xf_emit(ctx, 1, 0x11);			/* 0000007f RT_FORMAT */
+	xf_emit(ctx, 7, 0);			/* 0000007f RT_FORMAT */
+	xf_emit(ctx, 1, 0);			/* 00000007 */
+	xf_emit(ctx, 1, 0xfac6881);		/* 0fffffff RT_CONTROL */
+	xf_emit(ctx, 1, 0);			/* 00000003 MULTISAMPLE_CTRL */
+	if (IS_NVA3F(dev_priv->chipset))
+		xf_emit(ctx, 1, 3);		/* 00000003 tesla UNK16B4 */
+	xf_emit(ctx, 1, 0);			/* 00000001 ALPHA_TEST_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 00000007 ALPHA_TEST_FUNC */
+	xf_emit(ctx, 1, 0);			/* 00000001 FRAMEBUFFER_SRGB */
+	xf_emit(ctx, 1, 4);			/* ffffffff tesla UNK1400 */
+	xf_emit(ctx, 8, 0);			/* 00000001 BLEND_ENABLE */
+	xf_emit(ctx, 1, 0);			/* 00000001 LOGIC_OP_ENABLE */
+	xf_emit(ctx, 1, 2);			/* 0000001f BLEND_FUNC_SRC_RGB */
+	xf_emit(ctx, 1, 1);			/* 0000001f BLEND_FUNC_DST_RGB */
+	xf_emit(ctx, 1, 1);			/* 00000007 BLEND_EQUATION_RGB */
+	xf_emit(ctx, 1, 2);			/* 0000001f BLEND_FUNC_SRC_ALPHA */
+	xf_emit(ctx, 1, 1);			/* 0000001f BLEND_FUNC_DST_ALPHA */
+	xf_emit(ctx, 1, 1);			/* 00000007 BLEND_EQUATION_ALPHA */
+	xf_emit(ctx, 1, 1);			/* 00000001 UNK133C */
+	if (IS_NVA3F(dev_priv->chipset)) {
+		xf_emit(ctx, 1, 0);		/* 00000001 UNK12E4 */
+		xf_emit(ctx, 8, 2);		/* 0000001f IBLEND_FUNC_SRC_RGB */
+		xf_emit(ctx, 8, 1);		/* 0000001f IBLEND_FUNC_DST_RGB */
+		xf_emit(ctx, 8, 1);		/* 00000007 IBLEND_EQUATION_RGB */
+		xf_emit(ctx, 8, 2);		/* 0000001f IBLEND_FUNC_SRC_ALPHA */
+		xf_emit(ctx, 8, 1);		/* 0000001f IBLEND_FUNC_DST_ALPHA */
+		xf_emit(ctx, 8, 1);		/* 00000007 IBLEND_EQUATION_ALPHA */
+		xf_emit(ctx, 8, 1);		/* 00000001 IBLEND_UNK00 */
+		xf_emit(ctx, 1, 0);		/* 00000003 tesla UNK1928 */
+		xf_emit(ctx, 1, 0);		/* 00000001 UNK1140 */
 	}
-	xf_emit(ctx, 1, 4);
+	xf_emit(ctx, 1, 0);			/* 00000003 tesla UNK0F90 */
+	xf_emit(ctx, 1, 4);			/* 000000ff FP_RESULT_COUNT */
+	/* XXX: demagic this part some day */
 	if (dev_priv->chipset == 0x50)
 		xf_emit(ctx, 0x3a0, 0);
 	else if (dev_priv->chipset < 0x94)
@@ -2303,9 +3240,9 @@ nv50_graph_construct_xfer_tp2(struct nouveau_grctx *ctx)
 		xf_emit(ctx, 0x39f, 0);
 	else
 		xf_emit(ctx, 0x3a3, 0);
-	xf_emit(ctx, 1, 0x11);
-	xf_emit(ctx, 1, 0);
-	xf_emit(ctx, 1, 1);
+	xf_emit(ctx, 1, 0x11);			/* 3f/7f DST_FORMAT */
+	xf_emit(ctx, 1, 0);			/* 7 OPERATION */
+	xf_emit(ctx, 1, 1);			/* 1 DST_LINEAR */
 	xf_emit(ctx, 0x2d, 0);
 }
 
@@ -2323,52 +3260,56 @@ nv50_graph_construct_xfer2(struct nouveau_grctx *ctx)
 	if (dev_priv->chipset < 0xa0) {
 		for (i = 0; i < 8; i++) {
 			ctx->ctxvals_pos = offset + i;
+			/* that little bugger belongs to csched. No idea
+			 * what it's doing here. */
 			if (i == 0)
-				xf_emit(ctx, 1, 0x08100c12);
+				xf_emit(ctx, 1, 0x08100c12); /* FP_INTERPOLANT_CTRL */
 			if (units & (1 << i))
-				nv50_graph_construct_xfer_tp2(ctx);
+				nv50_graph_construct_xfer_mpc(ctx);
 			if ((ctx->ctxvals_pos-offset)/8 > size)
 				size = (ctx->ctxvals_pos-offset)/8;
 		}
 	} else {
 		/* Strand 0: TPs 0, 1 */
 		ctx->ctxvals_pos = offset;
-		xf_emit(ctx, 1, 0x08100c12);
+		/* that little bugger belongs to csched. No idea
+		 * what it's doing here. */
+		xf_emit(ctx, 1, 0x08100c12); /* FP_INTERPOLANT_CTRL */
 		if (units & (1 << 0))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 1))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
-		/* Strand 0: TPs 2, 3 */
+		/* Strand 1: TPs 2, 3 */
 		ctx->ctxvals_pos = offset + 1;
 		if (units & (1 << 2))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 3))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
-		/* Strand 0: TPs 4, 5, 6 */
+		/* Strand 2: TPs 4, 5, 6 */
 		ctx->ctxvals_pos = offset + 2;
 		if (units & (1 << 4))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 5))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 6))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 
-		/* Strand 0: TPs 7, 8, 9 */
+		/* Strand 3: TPs 7, 8, 9 */
 		ctx->ctxvals_pos = offset + 3;
 		if (units & (1 << 7))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 8))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if (units & (1 << 9))
-			nv50_graph_construct_xfer_tp2(ctx);
+			nv50_graph_construct_xfer_mpc(ctx);
 		if ((ctx->ctxvals_pos-offset)/8 > size)
 			size = (ctx->ctxvals_pos-offset)/8;
 	}
diff --git a/drivers/gpu/drm/nouveau/nv50_instmem.c b/drivers/gpu/drm/nouveau/nv50_instmem.c
index 5f21df3..ac3de05 100644
--- a/drivers/gpu/drm/nouveau/nv50_instmem.c
+++ b/drivers/gpu/drm/nouveau/nv50_instmem.c
@@ -32,41 +32,87 @@
 struct nv50_instmem_priv {
 	uint32_t save1700[5]; /* 0x1700->0x1710 */
 
-	struct nouveau_gpuobj_ref *pramin_pt;
-	struct nouveau_gpuobj_ref *pramin_bar;
-	struct nouveau_gpuobj_ref *fb_bar;
-
-	bool last_access_wr;
+	struct nouveau_gpuobj *pramin_pt;
+	struct nouveau_gpuobj *pramin_bar;
+	struct nouveau_gpuobj *fb_bar;
 };
 
-#define NV50_INSTMEM_PAGE_SHIFT 12
-#define NV50_INSTMEM_PAGE_SIZE  (1 << NV50_INSTMEM_PAGE_SHIFT)
-#define NV50_INSTMEM_PT_SIZE(a)	(((a) >> 12) << 3)
+static void
+nv50_channel_del(struct nouveau_channel **pchan)
+{
+	struct nouveau_channel *chan;
 
-/*NOTE: - Assumes 0x1700 already covers the correct MiB of PRAMIN
- */
-#define BAR0_WI32(g, o, v) do {                                   \
-	uint32_t offset;                                          \
-	if ((g)->im_backing) {                                    \
-		offset = (g)->im_backing_start;                   \
-	} else {                                                  \
-		offset  = chan->ramin->gpuobj->im_backing_start;  \
-		offset += (g)->im_pramin->start;                  \
-	}                                                         \
-	offset += (o);                                            \
-	nv_wr32(dev, NV_RAMIN + (offset & 0xfffff), (v));              \
-} while (0)
+	chan = *pchan;
+	*pchan = NULL;
+	if (!chan)
+		return;
+
+	nouveau_gpuobj_ref(NULL, &chan->ramfc);
+	nouveau_gpuobj_ref(NULL, &chan->vm_pd);
+	if (chan->ramin_heap.fl_entry.next)
+		drm_mm_takedown(&chan->ramin_heap);
+	nouveau_gpuobj_ref(NULL, &chan->ramin);
+	kfree(chan);
+}
+
+static int
+nv50_channel_new(struct drm_device *dev, u32 size,
+		 struct nouveau_channel **pchan)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 pgd = (dev_priv->chipset == 0x50) ? 0x1400 : 0x0200;
+	u32  fc = (dev_priv->chipset == 0x50) ? 0x0000 : 0x4200;
+	struct nouveau_channel *chan;
+	int ret;
+
+	chan = kzalloc(sizeof(*chan), GFP_KERNEL);
+	if (!chan)
+		return -ENOMEM;
+	chan->dev = dev;
+
+	ret = nouveau_gpuobj_new(dev, NULL, size, 0x1000, 0, &chan->ramin);
+	if (ret) {
+		nv50_channel_del(&chan);
+		return ret;
+	}
+
+	ret = drm_mm_init(&chan->ramin_heap, 0x6000, chan->ramin->size);
+	if (ret) {
+		nv50_channel_del(&chan);
+		return ret;
+	}
+
+	ret = nouveau_gpuobj_new_fake(dev, chan->ramin->pinst == ~0 ? ~0 :
+				      chan->ramin->pinst + pgd,
+				      chan->ramin->vinst + pgd,
+				      0x4000, NVOBJ_FLAG_ZERO_ALLOC,
+				      &chan->vm_pd);
+	if (ret) {
+		nv50_channel_del(&chan);
+		return ret;
+	}
+
+	ret = nouveau_gpuobj_new_fake(dev, chan->ramin->pinst == ~0 ? ~0 :
+				      chan->ramin->pinst + fc,
+				      chan->ramin->vinst + fc, 0x100,
+				      NVOBJ_FLAG_ZERO_ALLOC, &chan->ramfc);
+	if (ret) {
+		nv50_channel_del(&chan);
+		return ret;
+	}
+
+	*pchan = chan;
+	return 0;
+}
 
 int
 nv50_instmem_init(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nouveau_channel *chan;
-	uint32_t c_offset, c_size, c_ramfc, c_vmpd, c_base, pt_size;
-	uint32_t save_nv001700;
-	uint64_t v;
 	struct nv50_instmem_priv *priv;
+	struct nouveau_channel *chan;
 	int ret, i;
+	u32 tmp;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
 	if (!priv)
@@ -77,215 +123,115 @@ nv50_instmem_init(struct drm_device *dev)
 	for (i = 0x1700; i <= 0x1710; i += 4)
 		priv->save1700[(i-0x1700)/4] = nv_rd32(dev, i);
 
-	/* Reserve the last MiB of VRAM, we should probably try to avoid
-	 * setting up the below tables over the top of the VBIOS image at
-	 * some point.
-	 */
-	dev_priv->ramin_rsvd_vram = 1 << 20;
-	c_offset = dev_priv->vram_size - dev_priv->ramin_rsvd_vram;
-	c_size   = 128 << 10;
-	c_vmpd   = ((dev_priv->chipset & 0xf0) == 0x50) ? 0x1400 : 0x200;
-	c_ramfc  = ((dev_priv->chipset & 0xf0) == 0x50) ? 0x0 : 0x20;
-	c_base   = c_vmpd + 0x4000;
-	pt_size  = NV50_INSTMEM_PT_SIZE(dev_priv->ramin_size);
-
-	NV_DEBUG(dev, " Rsvd VRAM base: 0x%08x\n", c_offset);
-	NV_DEBUG(dev, "    VBIOS image: 0x%08x\n",
-				(nv_rd32(dev, 0x619f04) & ~0xff) << 8);
-	NV_DEBUG(dev, "  Aperture size: %d MiB\n", dev_priv->ramin_size >> 20);
-	NV_DEBUG(dev, "        PT size: %d KiB\n", pt_size >> 10);
-
-	/* Determine VM layout, we need to do this first to make sure
-	 * we allocate enough memory for all the page tables.
-	 */
-	dev_priv->vm_gart_base = roundup(NV50_VM_BLOCK, NV50_VM_BLOCK);
-	dev_priv->vm_gart_size = NV50_VM_BLOCK;
-
-	dev_priv->vm_vram_base = dev_priv->vm_gart_base + dev_priv->vm_gart_size;
-	dev_priv->vm_vram_size = dev_priv->vram_size;
-	if (dev_priv->vm_vram_size > NV50_VM_MAX_VRAM)
-		dev_priv->vm_vram_size = NV50_VM_MAX_VRAM;
-	dev_priv->vm_vram_size = roundup(dev_priv->vm_vram_size, NV50_VM_BLOCK);
-	dev_priv->vm_vram_pt_nr = dev_priv->vm_vram_size / NV50_VM_BLOCK;
-
-	dev_priv->vm_end = dev_priv->vm_vram_base + dev_priv->vm_vram_size;
-
-	NV_DEBUG(dev, "NV50VM: GART 0x%016llx-0x%016llx\n",
-		 dev_priv->vm_gart_base,
-		 dev_priv->vm_gart_base + dev_priv->vm_gart_size - 1);
-	NV_DEBUG(dev, "NV50VM: VRAM 0x%016llx-0x%016llx\n",
-		 dev_priv->vm_vram_base,
-		 dev_priv->vm_vram_base + dev_priv->vm_vram_size - 1);
-
-	c_size += dev_priv->vm_vram_pt_nr * (NV50_VM_BLOCK / 65536 * 8);
-
-	/* Map BAR0 PRAMIN aperture over the memory we want to use */
-	save_nv001700 = nv_rd32(dev, NV50_PUNK_BAR0_PRAMIN);
-	nv_wr32(dev, NV50_PUNK_BAR0_PRAMIN, (c_offset >> 16));
-
-	/* Create a fake channel, and use it as our "dummy" channels 0/127.
-	 * The main reason for creating a channel is so we can use the gpuobj
-	 * code.  However, it's probably worth noting that NVIDIA also setup
-	 * their channels 0/127 with the same values they configure here.
-	 * So, there may be some other reason for doing this.
-	 *
-	 * Have to create the entire channel manually, as the real channel
-	 * creation code assumes we have PRAMIN access, and we don't until
-	 * we're done here.
-	 */
-	chan = kzalloc(sizeof(*chan), GFP_KERNEL);
-	if (!chan)
+	/* Global PRAMIN heap */
+	ret = drm_mm_init(&dev_priv->ramin_heap, 0, dev_priv->ramin_size);
+	if (ret) {
+		NV_ERROR(dev, "Failed to init RAMIN heap\n");
 		return -ENOMEM;
-	chan->id = 0;
-	chan->dev = dev;
-	chan->file_priv = (struct drm_file *)-2;
-	dev_priv->fifos[0] = dev_priv->fifos[127] = chan;
+	}
 
-	/* Channel's PRAMIN object + heap */
-	ret = nouveau_gpuobj_new_fake(dev, 0, c_offset, c_size, 0,
-							NULL, &chan->ramin);
+	/* we need a channel to plug into the hw to control the BARs */
+	ret = nv50_channel_new(dev, 128*1024, &dev_priv->fifos[0]);
 	if (ret)
 		return ret;
+	chan = dev_priv->fifos[127] = dev_priv->fifos[0];
 
-	if (nouveau_mem_init_heap(&chan->ramin_heap, c_base, c_size - c_base))
-		return -ENOMEM;
-
-	/* RAMFC + zero channel's PRAMIN up to start of VM pagedir */
-	ret = nouveau_gpuobj_new_fake(dev, c_ramfc, c_offset + c_ramfc,
-						0x4000, 0, NULL, &chan->ramfc);
+	/* allocate page table for PRAMIN BAR */
+	ret = nouveau_gpuobj_new(dev, chan, (dev_priv->ramin_size >> 12) * 8,
+				 0x1000, NVOBJ_FLAG_ZERO_ALLOC,
+				 &priv->pramin_pt);
 	if (ret)
 		return ret;
 
-	for (i = 0; i < c_vmpd; i += 4)
-		BAR0_WI32(chan->ramin->gpuobj, i, 0);
+	nv_wo32(chan->vm_pd, 0x0000, priv->pramin_pt->vinst | 0x63);
+	nv_wo32(chan->vm_pd, 0x0004, 0);
 
-	/* VM page directory */
-	ret = nouveau_gpuobj_new_fake(dev, c_vmpd, c_offset + c_vmpd,
-					   0x4000, 0, &chan->vm_pd, NULL);
+	/* DMA object for PRAMIN BAR */
+	ret = nouveau_gpuobj_new(dev, chan, 6*4, 16, 0, &priv->pramin_bar);
 	if (ret)
 		return ret;
-	for (i = 0; i < 0x4000; i += 8) {
-		BAR0_WI32(chan->vm_pd, i + 0x00, 0x00000000);
-		BAR0_WI32(chan->vm_pd, i + 0x04, 0x00000000);
-	}
-
-	/* PRAMIN page table, cheat and map into VM at 0x0000000000.
-	 * We map the entire fake channel into the start of the PRAMIN BAR
-	 */
-	ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0, pt_size, 0x1000,
-				     0, &priv->pramin_pt);
+	nv_wo32(priv->pramin_bar, 0x00, 0x7fc00000);
+	nv_wo32(priv->pramin_bar, 0x04, dev_priv->ramin_size - 1);
+	nv_wo32(priv->pramin_bar, 0x08, 0x00000000);
+	nv_wo32(priv->pramin_bar, 0x0c, 0x00000000);
+	nv_wo32(priv->pramin_bar, 0x10, 0x00000000);
+	nv_wo32(priv->pramin_bar, 0x14, 0x00000000);
+
+	/* map channel into PRAMIN, gpuobj didn't do it for us */
+	ret = nv50_instmem_bind(dev, chan->ramin);
 	if (ret)
 		return ret;
 
-	v = c_offset | 1;
-	if (dev_priv->vram_sys_base) {
-		v += dev_priv->vram_sys_base;
-		v |= 0x30;
-	}
+	/* point hw at channel (0x1704) and PRAMIN BAR DMA object (0x170c) */
+	nv_wr32(dev, 0x001704, 0x00000000 | (chan->ramin->vinst >> 12));
+	nv_wr32(dev, 0x001704, 0x40000000 | (chan->ramin->vinst >> 12));
+	nv_wr32(dev, 0x00170c, 0x80000000 | (priv->pramin_bar->cinst >> 4));
 
-	i = 0;
-	while (v < dev_priv->vram_sys_base + c_offset + c_size) {
-		BAR0_WI32(priv->pramin_pt->gpuobj, i + 0, lower_32_bits(v));
-		BAR0_WI32(priv->pramin_pt->gpuobj, i + 4, upper_32_bits(v));
-		v += 0x1000;
-		i += 8;
+	tmp = nv_ri32(dev, 0);
+	nv_wi32(dev, 0, ~tmp);
+	if (nv_ri32(dev, 0) != ~tmp) {
+		NV_ERROR(dev, "PRAMIN readback failed\n");
+		return -EIO;
 	}
+	nv_wi32(dev, 0, tmp);
 
-	while (i < pt_size) {
-		BAR0_WI32(priv->pramin_pt->gpuobj, i + 0, 0x00000000);
-		BAR0_WI32(priv->pramin_pt->gpuobj, i + 4, 0x00000000);
-		i += 8;
-	}
+	dev_priv->ramin_available = true;
+
+	/* Determine VM layout */
+	dev_priv->vm_gart_base = roundup(NV50_VM_BLOCK, NV50_VM_BLOCK);
+	dev_priv->vm_gart_size = NV50_VM_BLOCK;
+
+	dev_priv->vm_vram_base = dev_priv->vm_gart_base + dev_priv->vm_gart_size;
+	dev_priv->vm_vram_size = dev_priv->vram_size;
+	if (dev_priv->vm_vram_size > NV50_VM_MAX_VRAM)
+		dev_priv->vm_vram_size = NV50_VM_MAX_VRAM;
+	dev_priv->vm_vram_size = roundup(dev_priv->vm_vram_size, NV50_VM_BLOCK);
+	dev_priv->vm_vram_pt_nr = dev_priv->vm_vram_size / NV50_VM_BLOCK;
+
+	dev_priv->vm_end = dev_priv->vm_vram_base + dev_priv->vm_vram_size;
 
-	BAR0_WI32(chan->vm_pd, 0x00, priv->pramin_pt->instance | 0x63);
-	BAR0_WI32(chan->vm_pd, 0x04, 0x00000000);
+	NV_DEBUG(dev, "NV50VM: GART 0x%016llx-0x%016llx\n",
+		 dev_priv->vm_gart_base,
+		 dev_priv->vm_gart_base + dev_priv->vm_gart_size - 1);
+	NV_DEBUG(dev, "NV50VM: VRAM 0x%016llx-0x%016llx\n",
+		 dev_priv->vm_vram_base,
+		 dev_priv->vm_vram_base + dev_priv->vm_vram_size - 1);
 
 	/* VRAM page table(s), mapped into VM at +1GiB  */
 	for (i = 0; i < dev_priv->vm_vram_pt_nr; i++) {
-		ret = nouveau_gpuobj_new_ref(dev, chan, NULL, 0,
-					     NV50_VM_BLOCK/65536*8, 0, 0,
-					     &chan->vm_vram_pt[i]);
+		ret = nouveau_gpuobj_new(dev, NULL, NV50_VM_BLOCK / 0x10000 * 8,
+					 0, NVOBJ_FLAG_ZERO_ALLOC,
+					 &chan->vm_vram_pt[i]);
 		if (ret) {
-			NV_ERROR(dev, "Error creating VRAM page tables: %d\n",
-									ret);
+			NV_ERROR(dev, "Error creating VRAM PGT: %d\n", ret);
 			dev_priv->vm_vram_pt_nr = i;
 			return ret;
 		}
-		dev_priv->vm_vram_pt[i] = chan->vm_vram_pt[i]->gpuobj;
-
-		for (v = 0; v < dev_priv->vm_vram_pt[i]->im_pramin->size;
-								v += 4)
-			BAR0_WI32(dev_priv->vm_vram_pt[i], v, 0);
+		dev_priv->vm_vram_pt[i] = chan->vm_vram_pt[i];
 
-		BAR0_WI32(chan->vm_pd, 0x10 + (i*8),
-			  chan->vm_vram_pt[i]->instance | 0x61);
-		BAR0_WI32(chan->vm_pd, 0x14 + (i*8), 0);
+		nv_wo32(chan->vm_pd, 0x10 + (i*8),
+			chan->vm_vram_pt[i]->vinst | 0x61);
+		nv_wo32(chan->vm_pd, 0x14 + (i*8), 0);
 	}
 
-	/* DMA object for PRAMIN BAR */
-	ret = nouveau_gpuobj_new_ref(dev, chan, chan, 0, 6*4, 16, 0,
-							&priv->pramin_bar);
-	if (ret)
-		return ret;
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x00, 0x7fc00000);
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x04, dev_priv->ramin_size - 1);
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x08, 0x00000000);
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x0c, 0x00000000);
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x10, 0x00000000);
-	BAR0_WI32(priv->pramin_bar->gpuobj, 0x14, 0x00000000);
-
 	/* DMA object for FB BAR */
-	ret = nouveau_gpuobj_new_ref(dev, chan, chan, 0, 6*4, 16, 0,
-							&priv->fb_bar);
+	ret = nouveau_gpuobj_new(dev, chan, 6*4, 16, 0, &priv->fb_bar);
 	if (ret)
 		return ret;
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x00, 0x7fc00000);
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x04, 0x40000000 +
-					      drm_get_resource_len(dev, 1) - 1);
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x08, 0x40000000);
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x0c, 0x00000000);
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x10, 0x00000000);
-	BAR0_WI32(priv->fb_bar->gpuobj, 0x14, 0x00000000);
+	nv_wo32(priv->fb_bar, 0x00, 0x7fc00000);
+	nv_wo32(priv->fb_bar, 0x04, 0x40000000 +
+				    pci_resource_len(dev->pdev, 1) - 1);
+	nv_wo32(priv->fb_bar, 0x08, 0x40000000);
+	nv_wo32(priv->fb_bar, 0x0c, 0x00000000);
+	nv_wo32(priv->fb_bar, 0x10, 0x00000000);
+	nv_wo32(priv->fb_bar, 0x14, 0x00000000);
 
-	/* Poke the relevant regs, and pray it works :) */
-	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->instance >> 12));
-	nv_wr32(dev, NV50_PUNK_UNK1710, 0);
-	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->instance >> 12) |
-					 NV50_PUNK_BAR_CFG_BASE_VALID);
-	nv_wr32(dev, NV50_PUNK_BAR1_CTXDMA, (priv->fb_bar->instance >> 4) |
-					NV50_PUNK_BAR1_CTXDMA_VALID);
-	nv_wr32(dev, NV50_PUNK_BAR3_CTXDMA, (priv->pramin_bar->instance >> 4) |
-					NV50_PUNK_BAR3_CTXDMA_VALID);
+	dev_priv->engine.instmem.flush(dev);
 
+	nv_wr32(dev, 0x001708, 0x80000000 | (priv->fb_bar->cinst >> 4));
 	for (i = 0; i < 8; i++)
 		nv_wr32(dev, 0x1900 + (i*4), 0);
 
-	/* Assume that praying isn't enough, check that we can re-read the
-	 * entire fake channel back from the PRAMIN BAR */
-	dev_priv->engine.instmem.prepare_access(dev, false);
-	for (i = 0; i < c_size; i += 4) {
-		if (nv_rd32(dev, NV_RAMIN + i) != nv_ri32(dev, i)) {
-			NV_ERROR(dev, "Error reading back PRAMIN at 0x%08x\n",
-									i);
-			dev_priv->engine.instmem.finish_access(dev);
-			return -EINVAL;
-		}
-	}
-	dev_priv->engine.instmem.finish_access(dev);
-
-	nv_wr32(dev, NV50_PUNK_BAR0_PRAMIN, save_nv001700);
-
-	/* Global PRAMIN heap */
-	if (nouveau_mem_init_heap(&dev_priv->ramin_heap,
-				  c_size, dev_priv->ramin_size - c_size)) {
-		dev_priv->ramin_heap = NULL;
-		NV_ERROR(dev, "Failed to init RAMIN heap\n");
-	}
-
-	/*XXX: incorrect, but needed to make hash func "work" */
-	dev_priv->ramht_offset = 0x10000;
-	dev_priv->ramht_bits   = 9;
-	dev_priv->ramht_size   = (1 << dev_priv->ramht_bits);
 	return 0;
 }
 
@@ -302,29 +248,24 @@ nv50_instmem_takedown(struct drm_device *dev)
 	if (!priv)
 		return;
 
+	dev_priv->ramin_available = false;
+
 	/* Restore state from before init */
 	for (i = 0x1700; i <= 0x1710; i += 4)
 		nv_wr32(dev, i, priv->save1700[(i - 0x1700) / 4]);
 
-	nouveau_gpuobj_ref_del(dev, &priv->fb_bar);
-	nouveau_gpuobj_ref_del(dev, &priv->pramin_bar);
-	nouveau_gpuobj_ref_del(dev, &priv->pramin_pt);
+	nouveau_gpuobj_ref(NULL, &priv->fb_bar);
+	nouveau_gpuobj_ref(NULL, &priv->pramin_bar);
+	nouveau_gpuobj_ref(NULL, &priv->pramin_pt);
 
 	/* Destroy dummy channel */
 	if (chan) {
-		for (i = 0; i < dev_priv->vm_vram_pt_nr; i++) {
-			nouveau_gpuobj_ref_del(dev, &chan->vm_vram_pt[i]);
-			dev_priv->vm_vram_pt[i] = NULL;
-		}
+		for (i = 0; i < dev_priv->vm_vram_pt_nr; i++)
+			nouveau_gpuobj_ref(NULL, &chan->vm_vram_pt[i]);
 		dev_priv->vm_vram_pt_nr = 0;
 
-		nouveau_gpuobj_del(dev, &chan->vm_pd);
-		nouveau_gpuobj_ref_del(dev, &chan->ramfc);
-		nouveau_gpuobj_ref_del(dev, &chan->ramin);
-		nouveau_mem_takedown(&chan->ramin_heap);
-
-		dev_priv->fifos[0] = dev_priv->fifos[127] = NULL;
-		kfree(chan);
+		nv50_channel_del(&dev_priv->fifos[0]);
+		dev_priv->fifos[127] = NULL;
 	}
 
 	dev_priv->engine.instmem.priv = NULL;
@@ -336,14 +277,14 @@ nv50_instmem_suspend(struct drm_device *dev)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_channel *chan = dev_priv->fifos[0];
-	struct nouveau_gpuobj *ramin = chan->ramin->gpuobj;
+	struct nouveau_gpuobj *ramin = chan->ramin;
 	int i;
 
-	ramin->im_backing_suspend = vmalloc(ramin->im_pramin->size);
+	ramin->im_backing_suspend = vmalloc(ramin->size);
 	if (!ramin->im_backing_suspend)
 		return -ENOMEM;
 
-	for (i = 0; i < ramin->im_pramin->size; i += 4)
+	for (i = 0; i < ramin->size; i += 4)
 		ramin->im_backing_suspend[i/4] = nv_ri32(dev, i);
 	return 0;
 }
@@ -354,23 +295,25 @@ nv50_instmem_resume(struct drm_device *dev)
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nv50_instmem_priv *priv = dev_priv->engine.instmem.priv;
 	struct nouveau_channel *chan = dev_priv->fifos[0];
-	struct nouveau_gpuobj *ramin = chan->ramin->gpuobj;
+	struct nouveau_gpuobj *ramin = chan->ramin;
 	int i;
 
-	nv_wr32(dev, NV50_PUNK_BAR0_PRAMIN, (ramin->im_backing_start >> 16));
-	for (i = 0; i < ramin->im_pramin->size; i += 4)
-		BAR0_WI32(ramin, i, ramin->im_backing_suspend[i/4]);
+	dev_priv->ramin_available = false;
+	dev_priv->ramin_base = ~0;
+	for (i = 0; i < ramin->size; i += 4)
+		nv_wo32(ramin, i, ramin->im_backing_suspend[i/4]);
+	dev_priv->ramin_available = true;
 	vfree(ramin->im_backing_suspend);
 	ramin->im_backing_suspend = NULL;
 
 	/* Poke the relevant regs, and pray it works :) */
-	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->instance >> 12));
+	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->vinst >> 12));
 	nv_wr32(dev, NV50_PUNK_UNK1710, 0);
-	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->instance >> 12) |
+	nv_wr32(dev, NV50_PUNK_BAR_CFG_BASE, (chan->ramin->vinst >> 12) |
 					 NV50_PUNK_BAR_CFG_BASE_VALID);
-	nv_wr32(dev, NV50_PUNK_BAR1_CTXDMA, (priv->fb_bar->instance >> 4) |
+	nv_wr32(dev, NV50_PUNK_BAR1_CTXDMA, (priv->fb_bar->cinst >> 4) |
 					NV50_PUNK_BAR1_CTXDMA_VALID);
-	nv_wr32(dev, NV50_PUNK_BAR3_CTXDMA, (priv->pramin_bar->instance >> 4) |
+	nv_wr32(dev, NV50_PUNK_BAR3_CTXDMA, (priv->pramin_bar->cinst >> 4) |
 					NV50_PUNK_BAR3_CTXDMA_VALID);
 
 	for (i = 0; i < 8; i++)
@@ -386,7 +329,7 @@ nv50_instmem_populate(struct drm_device *dev, struct nouveau_gpuobj *gpuobj,
 	if (gpuobj->im_backing)
 		return -EINVAL;
 
-	*sz = ALIGN(*sz, NV50_INSTMEM_PAGE_SIZE);
+	*sz = ALIGN(*sz, 4096);
 	if (*sz == 0)
 		return -EINVAL;
 
@@ -404,9 +347,7 @@ nv50_instmem_populate(struct drm_device *dev, struct nouveau_gpuobj *gpuobj,
 		return ret;
 	}
 
-	gpuobj->im_backing_start = gpuobj->im_backing->bo.mem.mm_node->start;
-	gpuobj->im_backing_start <<= PAGE_SHIFT;
-
+	gpuobj->vinst = gpuobj->im_backing->bo.mem.mm_node->start << PAGE_SHIFT;
 	return 0;
 }
 
@@ -429,23 +370,23 @@ nv50_instmem_bind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 {
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nv50_instmem_priv *priv = dev_priv->engine.instmem.priv;
-	struct nouveau_gpuobj *pramin_pt = priv->pramin_pt->gpuobj;
+	struct nouveau_gpuobj *pramin_pt = priv->pramin_pt;
 	uint32_t pte, pte_end;
 	uint64_t vram;
 
 	if (!gpuobj->im_backing || !gpuobj->im_pramin || gpuobj->im_bound)
 		return -EINVAL;
 
-	NV_DEBUG(dev, "st=0x%0llx sz=0x%0llx\n",
+	NV_DEBUG(dev, "st=0x%lx sz=0x%lx\n",
 		 gpuobj->im_pramin->start, gpuobj->im_pramin->size);
 
 	pte     = (gpuobj->im_pramin->start >> 12) << 1;
 	pte_end = ((gpuobj->im_pramin->size >> 12) << 1) + pte;
-	vram    = gpuobj->im_backing_start;
+	vram    = gpuobj->vinst;
 
-	NV_DEBUG(dev, "pramin=0x%llx, pte=%d, pte_end=%d\n",
+	NV_DEBUG(dev, "pramin=0x%lx, pte=%d, pte_end=%d\n",
 		 gpuobj->im_pramin->start, pte, pte_end);
-	NV_DEBUG(dev, "first vram page: 0x%08x\n", gpuobj->im_backing_start);
+	NV_DEBUG(dev, "first vram page: 0x%010llx\n", gpuobj->vinst);
 
 	vram |= 1;
 	if (dev_priv->vram_sys_base) {
@@ -453,27 +394,16 @@ nv50_instmem_bind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 		vram |= 0x30;
 	}
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	while (pte < pte_end) {
-		nv_wo32(dev, pramin_pt, pte++, lower_32_bits(vram));
-		nv_wo32(dev, pramin_pt, pte++, upper_32_bits(vram));
-		vram += NV50_INSTMEM_PAGE_SIZE;
-	}
-	dev_priv->engine.instmem.finish_access(dev);
-
-	nv_wr32(dev, 0x100c80, 0x00040001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (1)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
+		nv_wo32(pramin_pt, (pte * 4) + 0, lower_32_bits(vram));
+		nv_wo32(pramin_pt, (pte * 4) + 4, upper_32_bits(vram));
+		vram += 0x1000;
+		pte += 2;
 	}
+	dev_priv->engine.instmem.flush(dev);
 
-	nv_wr32(dev, 0x100c80, 0x00060001);
-	if (!nv_wait(0x100c80, 0x00000001, 0x00000000)) {
-		NV_ERROR(dev, "timeout: (0x100c80 & 1) == 0 (2)\n");
-		NV_ERROR(dev, "0x100c80 = 0x%08x\n", nv_rd32(dev, 0x100c80));
-		return -EBUSY;
-	}
+	nv50_vm_flush(dev, 4);
+	nv50_vm_flush(dev, 6);
 
 	gpuobj->im_bound = 1;
 	return 0;
@@ -489,39 +419,44 @@ nv50_instmem_unbind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
 	if (gpuobj->im_bound == 0)
 		return -EINVAL;
 
+	/* can happen during late takedown */
+	if (unlikely(!dev_priv->ramin_available))
+		return 0;
+
 	pte     = (gpuobj->im_pramin->start >> 12) << 1;
 	pte_end = ((gpuobj->im_pramin->size >> 12) << 1) + pte;
 
-	dev_priv->engine.instmem.prepare_access(dev, true);
 	while (pte < pte_end) {
-		nv_wo32(dev, priv->pramin_pt->gpuobj, pte++, 0x00000000);
-		nv_wo32(dev, priv->pramin_pt->gpuobj, pte++, 0x00000000);
+		nv_wo32(priv->pramin_pt, (pte * 4) + 0, 0x00000000);
+		nv_wo32(priv->pramin_pt, (pte * 4) + 4, 0x00000000);
+		pte += 2;
 	}
-	dev_priv->engine.instmem.finish_access(dev);
+	dev_priv->engine.instmem.flush(dev);
 
 	gpuobj->im_bound = 0;
 	return 0;
 }
 
 void
-nv50_instmem_prepare_access(struct drm_device *dev, bool write)
+nv50_instmem_flush(struct drm_device *dev)
 {
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nv50_instmem_priv *priv = dev_priv->engine.instmem.priv;
-
-	priv->last_access_wr = write;
+	nv_wr32(dev, 0x00330c, 0x00000001);
+	if (!nv_wait(dev, 0x00330c, 0x00000002, 0x00000000))
+		NV_ERROR(dev, "PRAMIN flush timeout\n");
 }
 
 void
-nv50_instmem_finish_access(struct drm_device *dev)
+nv84_instmem_flush(struct drm_device *dev)
 {
-	struct drm_nouveau_private *dev_priv = dev->dev_private;
-	struct nv50_instmem_priv *priv = dev_priv->engine.instmem.priv;
-
-	if (priv->last_access_wr) {
-		nv_wr32(dev, 0x070000, 0x00000001);
-		if (!nv_wait(0x070000, 0x00000001, 0x00000000))
-			NV_ERROR(dev, "PRAMIN flush timeout\n");
-	}
+	nv_wr32(dev, 0x070000, 0x00000001);
+	if (!nv_wait(dev, 0x070000, 0x00000002, 0x00000000))
+		NV_ERROR(dev, "PRAMIN flush timeout\n");
 }
 
+void
+nv50_vm_flush(struct drm_device *dev, int engine)
+{
+	nv_wr32(dev, 0x100c80, (engine << 16) | 1);
+	if (!nv_wait(dev, 0x100c80, 0x00000001, 0x00000000))
+		NV_ERROR(dev, "vm flush timeout: engine %d\n", engine);
+}
diff --git a/drivers/gpu/drm/nouveau/nv50_sor.c b/drivers/gpu/drm/nouveau/nv50_sor.c
index 812778d..b4a5ecb 100644
--- a/drivers/gpu/drm/nouveau/nv50_sor.c
+++ b/drivers/gpu/drm/nouveau/nv50_sor.c
@@ -37,52 +37,32 @@
 #include "nv50_display.h"
 
 static void
-nv50_sor_disconnect(struct nouveau_encoder *nv_encoder)
+nv50_sor_disconnect(struct drm_encoder *encoder)
 {
-	struct drm_device *dev = to_drm_encoder(nv_encoder)->dev;
+	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
+	struct drm_device *dev = encoder->dev;
 	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	struct nouveau_channel *evo = dev_priv->evo;
 	int ret;
 
+	if (!nv_encoder->crtc)
+		return;
+	nv50_crtc_blank(nouveau_crtc(nv_encoder->crtc), true);
+
 	NV_DEBUG_KMS(dev, "Disconnecting SOR %d\n", nv_encoder->or);
 
-	ret = RING_SPACE(evo, 2);
+	ret = RING_SPACE(evo, 4);
 	if (ret) {
 		NV_ERROR(dev, "no space while disconnecting SOR\n");
 		return;
 	}
 	BEGIN_RING(evo, 0, NV50_EVO_SOR(nv_encoder->or, MODE_CTRL), 1);
-	OUT_RING(evo, 0);
-}
-
-static void
-nv50_sor_dp_link_train(struct drm_encoder *encoder)
-{
-	struct drm_device *dev = encoder->dev;
-	struct nouveau_encoder *nv_encoder = nouveau_encoder(encoder);
-	struct bit_displayport_encoder_table *dpe;
-	int dpe_headerlen;
-
-	dpe = nouveau_bios_dp_table(dev, nv_encoder->dcb, &dpe_headerlen);
-	if (!dpe) {
-		NV_ERROR(dev, "SOR-%d: no DP encoder table!\n", nv_encoder->or);
-		return;
-	}
+	OUT_RING  (evo, 0);
+	BEGIN_RING(evo, 0, NV50_EVO_UPDATE, 1);
+	OUT_RING  (evo, 0);
 
-	if (dpe->script0) {
-		NV_DEBUG_KMS(dev, "SOR-%d: running DP script 0\n", nv_encoder->or);
-		nouveau_bios_run_init_table(dev, le16_to_cpu(dpe->script0),
-					    nv_encoder->dcb);
-	}
-
-	if (!nouveau_dp_link_train(encoder))
-		NV_ERROR(dev, "SOR-%d: link training failed\n", nv_encoder->or);
-
-	if (dpe->script1) {
-		NV_DEBUG_KMS(dev, "SOR-%d: running DP script 1\n", nv_encoder->or);
-		nouveau_bios_run_init_table(dev, le16_to_cpu(dpe->script1),
-					    nv_encoder->dcb);
-	}
+	nv_encoder->crtc = NULL;
+	nv_encoder->last_dpms = DRM_MODE_DPMS_OFF;
 }
 
 static void
@@ -94,14 +74,16 @@ nv50_sor_dpms(struct drm_encoder *encoder, int mode)
 	uint32_t val;
 	int or = nv_encoder->or;
 
-	NV_DEBUG_KMS(dev, "or %d mode %d\n", or, mode);
+	NV_DEBUG_KMS(dev, "or %d type %d mode %d\n", or, nv_encoder->dcb->type, mode);
 
 	nv_encoder->last_dpms = mode;
 	list_for_each_entry(enc, &dev->mode_config.encoder_list, head) {
 		struct nouveau_encoder *nvenc = nouveau_encoder(enc);
 
 		if (nvenc == nv_encoder ||
-		    nvenc->disconnect != nv50_sor_disconnect ||
+		    (nvenc->dcb->type != OUTPUT_TMDS &&
+		     nvenc->dcb->type != OUTPUT_LVDS &&
+		     nvenc->dcb->type != OUTPUT_DP) ||
 		    nvenc->dcb->or != nv_encoder->dcb->or)
 			continue;
 
@@ -110,7 +92,7 @@ nv50_sor_dpms(struct drm_encoder *encoder, int mode)
 	}
 
 	/* wait for it to be done */
-	if (!nv_wait(NV50_PDISPLAY_SOR_DPMS_CTRL(or),
+	if (!nv_wait(dev, NV50_PDISPLAY_SOR_DPMS_CTRL(or),
 		     NV50_PDISPLAY_SOR_DPMS_CTRL_PENDING, 0)) {
 		NV_ERROR(dev, "timeout: SOR_DPMS_CTRL_PENDING(%d) == 0\n", or);
 		NV_ERROR(dev, "SOR_DPMS_CTRL(%d) = 0x%08x\n", or,
@@ -126,15 +108,29 @@ nv50_sor_dpms(struct drm_encoder *encoder, int mode)
 
 	nv_wr32(dev, NV50_PDISPLAY_SOR_DPMS_CTRL(or), val |
 		NV50_PDISPLAY_SOR_DPMS_CTRL_PENDING);
-	if (!nv_wait(NV50_PDISPLAY_SOR_DPMS_STATE(or),
+	if (!nv_wait(dev, NV50_PDISPLAY_SOR_DPMS_STATE(or),
 		     NV50_PDISPLAY_SOR_DPMS_STATE_WAIT, 0)) {
 		NV_ERROR(dev, "timeout: SOR_DPMS_STATE_WAIT(%d) == 0\n", or);
 		NV_ERROR(dev, "SOR_DPMS_STATE(%d) = 0x%08x\n", or,
 			 nv_rd32(dev, NV50_PDISPLAY_SOR_DPMS_STATE(or)));
 	}
 
-	if (nv_encoder->dcb->type == OUTPUT_DP && mode == DRM_MODE_DPMS_ON)
-		nv50_sor_dp_link_train(encoder);
+	if (nv_encoder->dcb->type == OUTPUT_DP) {
+		struct nouveau_i2c_chan *auxch;
+
+		auxch = nouveau_i2c_find(dev, nv_encoder->dcb->i2c_index);
+		if (!auxch)
+			return;
+
+		if (mode == DRM_MODE_DPMS_ON) {
+			u8 status = DP_SET_POWER_D0;
+			nouveau_dp_auxch(auxch, 8, DP_SET_POWER, &status, 1);
+			nouveau_dp_link_train(encoder);
+		} else {
+			u8 status = DP_SET_POWER_D3;
+			nouveau_dp_auxch(auxch, 8, DP_SET_POWER, &status, 1);
+		}
+	}
 }
 
 static void
@@ -196,7 +192,8 @@ nv50_sor_mode_set(struct drm_encoder *encoder, struct drm_display_mode *mode,
 	uint32_t mode_ctl = 0;
 	int ret;
 
-	NV_DEBUG_KMS(dev, "or %d\n", nv_encoder->or);
+	NV_DEBUG_KMS(dev, "or %d type %d -> crtc %d\n",
+		     nv_encoder->or, nv_encoder->dcb->type, crtc->index);
 
 	nv50_sor_dpms(encoder, DRM_MODE_DPMS_ON);
 
@@ -239,6 +236,14 @@ nv50_sor_mode_set(struct drm_encoder *encoder, struct drm_display_mode *mode,
 	}
 	BEGIN_RING(evo, 0, NV50_EVO_SOR(nv_encoder->or, MODE_CTRL), 1);
 	OUT_RING(evo, mode_ctl);
+
+	nv_encoder->crtc = encoder->crtc;
+}
+
+static struct drm_crtc *
+nv50_sor_crtc_get(struct drm_encoder *encoder)
+{
+	return nouveau_encoder(encoder)->crtc;
 }
 
 static const struct drm_encoder_helper_funcs nv50_sor_helper_funcs = {
@@ -249,7 +254,9 @@ static const struct drm_encoder_helper_funcs nv50_sor_helper_funcs = {
 	.prepare = nv50_sor_prepare,
 	.commit = nv50_sor_commit,
 	.mode_set = nv50_sor_mode_set,
-	.detect = NULL
+	.get_crtc = nv50_sor_crtc_get,
+	.detect = NULL,
+	.disable = nv50_sor_disconnect
 };
 
 static void
@@ -272,32 +279,22 @@ static const struct drm_encoder_funcs nv50_sor_encoder_funcs = {
 };
 
 int
-nv50_sor_create(struct drm_device *dev, struct dcb_entry *entry)
+nv50_sor_create(struct drm_connector *connector, struct dcb_entry *entry)
 {
 	struct nouveau_encoder *nv_encoder = NULL;
+	struct drm_device *dev = connector->dev;
 	struct drm_encoder *encoder;
-	bool dum;
 	int type;
 
 	NV_DEBUG_KMS(dev, "\n");
 
 	switch (entry->type) {
 	case OUTPUT_TMDS:
-		NV_INFO(dev, "Detected a TMDS output\n");
+	case OUTPUT_DP:
 		type = DRM_MODE_ENCODER_TMDS;
 		break;
 	case OUTPUT_LVDS:
-		NV_INFO(dev, "Detected a LVDS output\n");
 		type = DRM_MODE_ENCODER_LVDS;
-
-		if (nouveau_bios_parse_lvds_table(dev, 0, &dum, &dum)) {
-			NV_ERROR(dev, "Failed parsing LVDS table\n");
-			return -EINVAL;
-		}
-		break;
-	case OUTPUT_DP:
-		NV_INFO(dev, "Detected a DP output\n");
-		type = DRM_MODE_ENCODER_TMDS;
 		break;
 	default:
 		return -EINVAL;
@@ -310,8 +307,7 @@ nv50_sor_create(struct drm_device *dev, struct dcb_entry *entry)
 
 	nv_encoder->dcb = entry;
 	nv_encoder->or = ffs(entry->or) - 1;
-
-	nv_encoder->disconnect = nv50_sor_disconnect;
+	nv_encoder->last_dpms = DRM_MODE_DPMS_OFF;
 
 	drm_encoder_init(dev, encoder, &nv50_sor_encoder_funcs, type);
 	drm_encoder_helper_add(encoder, &nv50_sor_helper_funcs);
@@ -342,5 +338,6 @@ nv50_sor_create(struct drm_device *dev, struct dcb_entry *entry)
 			nv_encoder->dp.mc_unknown = 5;
 	}
 
+	drm_mode_connector_attach_encoder(connector, encoder);
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nvc0_fb.c b/drivers/gpu/drm/nouveau/nvc0_fb.c
new file mode 100644
index 0000000..26a9960
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvc0_fb.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#include "drmP.h"
+
+#include "nouveau_drv.h"
+
+int
+nvc0_fb_init(struct drm_device *dev)
+{
+	return 0;
+}
+
+void
+nvc0_fb_takedown(struct drm_device *dev)
+{
+}
diff --git a/drivers/gpu/drm/nouveau/nvc0_fifo.c b/drivers/gpu/drm/nouveau/nvc0_fifo.c
new file mode 100644
index 0000000..2cdb7c3
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvc0_fifo.c
@@ -0,0 +1,89 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#include "drmP.h"
+
+#include "nouveau_drv.h"
+
+void
+nvc0_fifo_disable(struct drm_device *dev)
+{
+}
+
+void
+nvc0_fifo_enable(struct drm_device *dev)
+{
+}
+
+bool
+nvc0_fifo_reassign(struct drm_device *dev, bool enable)
+{
+	return false;
+}
+
+bool
+nvc0_fifo_cache_pull(struct drm_device *dev, bool enable)
+{
+	return false;
+}
+
+int
+nvc0_fifo_channel_id(struct drm_device *dev)
+{
+	return 127;
+}
+
+int
+nvc0_fifo_create_context(struct nouveau_channel *chan)
+{
+	return 0;
+}
+
+void
+nvc0_fifo_destroy_context(struct nouveau_channel *chan)
+{
+}
+
+int
+nvc0_fifo_load_context(struct nouveau_channel *chan)
+{
+	return 0;
+}
+
+int
+nvc0_fifo_unload_context(struct drm_device *dev)
+{
+	return 0;
+}
+
+void
+nvc0_fifo_takedown(struct drm_device *dev)
+{
+}
+
+int
+nvc0_fifo_init(struct drm_device *dev)
+{
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nvc0_graph.c b/drivers/gpu/drm/nouveau/nvc0_graph.c
new file mode 100644
index 0000000..edf2b21
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvc0_graph.c
@@ -0,0 +1,74 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#include "drmP.h"
+
+#include "nouveau_drv.h"
+
+void
+nvc0_graph_fifo_access(struct drm_device *dev, bool enabled)
+{
+}
+
+struct nouveau_channel *
+nvc0_graph_channel(struct drm_device *dev)
+{
+	return NULL;
+}
+
+int
+nvc0_graph_create_context(struct nouveau_channel *chan)
+{
+	return 0;
+}
+
+void
+nvc0_graph_destroy_context(struct nouveau_channel *chan)
+{
+}
+
+int
+nvc0_graph_load_context(struct nouveau_channel *chan)
+{
+	return 0;
+}
+
+int
+nvc0_graph_unload_context(struct drm_device *dev)
+{
+	return 0;
+}
+
+void
+nvc0_graph_takedown(struct drm_device *dev)
+{
+}
+
+int
+nvc0_graph_init(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	dev_priv->engine.graph.accel_blocked = true;
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nvc0_instmem.c b/drivers/gpu/drm/nouveau/nvc0_instmem.c
new file mode 100644
index 0000000..152d8e8
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvc0_instmem.c
@@ -0,0 +1,229 @@
+/*
+ * Copyright 2010 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs
+ */
+
+#include "drmP.h"
+
+#include "nouveau_drv.h"
+
+int
+nvc0_instmem_populate(struct drm_device *dev, struct nouveau_gpuobj *gpuobj,
+		      uint32_t *size)
+{
+	int ret;
+
+	*size = ALIGN(*size, 4096);
+	if (*size == 0)
+		return -EINVAL;
+
+	ret = nouveau_bo_new(dev, NULL, *size, 0, TTM_PL_FLAG_VRAM, 0, 0x0000,
+			     true, false, &gpuobj->im_backing);
+	if (ret) {
+		NV_ERROR(dev, "error getting PRAMIN backing pages: %d\n", ret);
+		return ret;
+	}
+
+	ret = nouveau_bo_pin(gpuobj->im_backing, TTM_PL_FLAG_VRAM);
+	if (ret) {
+		NV_ERROR(dev, "error pinning PRAMIN backing VRAM: %d\n", ret);
+		nouveau_bo_ref(NULL, &gpuobj->im_backing);
+		return ret;
+	}
+
+	gpuobj->vinst = gpuobj->im_backing->bo.mem.mm_node->start << PAGE_SHIFT;
+	return 0;
+}
+
+void
+nvc0_instmem_clear(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+
+	if (gpuobj && gpuobj->im_backing) {
+		if (gpuobj->im_bound)
+			dev_priv->engine.instmem.unbind(dev, gpuobj);
+		nouveau_bo_unpin(gpuobj->im_backing);
+		nouveau_bo_ref(NULL, &gpuobj->im_backing);
+		gpuobj->im_backing = NULL;
+	}
+}
+
+int
+nvc0_instmem_bind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	uint32_t pte, pte_end;
+	uint64_t vram;
+
+	if (!gpuobj->im_backing || !gpuobj->im_pramin || gpuobj->im_bound)
+		return -EINVAL;
+
+	NV_DEBUG(dev, "st=0x%lx sz=0x%lx\n",
+		 gpuobj->im_pramin->start, gpuobj->im_pramin->size);
+
+	pte     = gpuobj->im_pramin->start >> 12;
+	pte_end = (gpuobj->im_pramin->size >> 12) + pte;
+	vram    = gpuobj->vinst;
+
+	NV_DEBUG(dev, "pramin=0x%lx, pte=%u, pte_end=%u\n",
+		 gpuobj->im_pramin->start, pte, pte_end);
+	NV_DEBUG(dev, "first vram page: 0x%010llx\n", gpuobj->vinst);
+
+	while (pte < pte_end) {
+		nv_wr32(dev, 0x702000 + (pte * 8), (vram >> 8) | 1);
+		nv_wr32(dev, 0x702004 + (pte * 8), 0);
+		vram += 4096;
+		pte++;
+	}
+	dev_priv->engine.instmem.flush(dev);
+
+	if (1) {
+		u32 chan = nv_rd32(dev, 0x1700) << 16;
+		nv_wr32(dev, 0x100cb8, (chan + 0x1000) >> 8);
+		nv_wr32(dev, 0x100cbc, 0x80000005);
+	}
+
+	gpuobj->im_bound = 1;
+	return 0;
+}
+
+int
+nvc0_instmem_unbind(struct drm_device *dev, struct nouveau_gpuobj *gpuobj)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	uint32_t pte, pte_end;
+
+	if (gpuobj->im_bound == 0)
+		return -EINVAL;
+
+	pte     = gpuobj->im_pramin->start >> 12;
+	pte_end = (gpuobj->im_pramin->size >> 12) + pte;
+	while (pte < pte_end) {
+		nv_wr32(dev, 0x702000 + (pte * 8), 0);
+		nv_wr32(dev, 0x702004 + (pte * 8), 0);
+		pte++;
+	}
+	dev_priv->engine.instmem.flush(dev);
+
+	gpuobj->im_bound = 0;
+	return 0;
+}
+
+void
+nvc0_instmem_flush(struct drm_device *dev)
+{
+	nv_wr32(dev, 0x070000, 1);
+	if (!nv_wait(dev, 0x070000, 0x00000002, 0x00000000))
+		NV_ERROR(dev, "PRAMIN flush timeout\n");
+}
+
+int
+nvc0_instmem_suspend(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 *buf;
+	int i;
+
+	dev_priv->susres.ramin_copy = vmalloc(65536);
+	if (!dev_priv->susres.ramin_copy)
+		return -ENOMEM;
+	buf = dev_priv->susres.ramin_copy;
+
+	for (i = 0; i < 65536; i += 4)
+		buf[i/4] = nv_rd32(dev, NV04_PRAMIN + i);
+	return 0;
+}
+
+void
+nvc0_instmem_resume(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u32 *buf = dev_priv->susres.ramin_copy;
+	u64 chan;
+	int i;
+
+	chan = dev_priv->vram_size - dev_priv->ramin_rsvd_vram;
+	nv_wr32(dev, 0x001700, chan >> 16);
+
+	for (i = 0; i < 65536; i += 4)
+		nv_wr32(dev, NV04_PRAMIN + i, buf[i/4]);
+	vfree(dev_priv->susres.ramin_copy);
+	dev_priv->susres.ramin_copy = NULL;
+
+	nv_wr32(dev, 0x001714, 0xc0000000 | (chan >> 12));
+}
+
+int
+nvc0_instmem_init(struct drm_device *dev)
+{
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
+	u64 chan, pgt3, imem, lim3 = dev_priv->ramin_size - 1;
+	int ret, i;
+
+	dev_priv->ramin_rsvd_vram = 1 * 1024 * 1024;
+	chan = dev_priv->vram_size - dev_priv->ramin_rsvd_vram;
+	imem = 4096 + 4096 + 32768;
+
+	nv_wr32(dev, 0x001700, chan >> 16);
+
+	/* channel setup */
+	nv_wr32(dev, 0x700200, lower_32_bits(chan + 0x1000));
+	nv_wr32(dev, 0x700204, upper_32_bits(chan + 0x1000));
+	nv_wr32(dev, 0x700208, lower_32_bits(lim3));
+	nv_wr32(dev, 0x70020c, upper_32_bits(lim3));
+
+	/* point pgd -> pgt */
+	nv_wr32(dev, 0x701000, 0);
+	nv_wr32(dev, 0x701004, ((chan + 0x2000) >> 8) | 1);
+
+	/* point pgt -> physical vram for channel */
+	pgt3 = 0x2000;
+	for (i = 0; i < dev_priv->ramin_rsvd_vram; i += 4096, pgt3 += 8) {
+		nv_wr32(dev, 0x700000 + pgt3, ((chan + i) >> 8) | 1);
+		nv_wr32(dev, 0x700004 + pgt3, 0);
+	}
+
+	/* clear rest of pgt */
+	for (; i < dev_priv->ramin_size; i += 4096, pgt3 += 8) {
+		nv_wr32(dev, 0x700000 + pgt3, 0);
+		nv_wr32(dev, 0x700004 + pgt3, 0);
+	}
+
+	/* point bar3 at the channel */
+	nv_wr32(dev, 0x001714, 0xc0000000 | (chan >> 12));
+
+	/* Global PRAMIN heap */
+	ret = drm_mm_init(&dev_priv->ramin_heap, imem,
+			  dev_priv->ramin_size - imem);
+	if (ret) {
+		NV_ERROR(dev, "Failed to init RAMIN heap: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+void
+nvc0_instmem_takedown(struct drm_device *dev)
+{
+}
diff --git a/drivers/gpu/drm/nouveau/nvreg.h b/drivers/gpu/drm/nouveau/nvreg.h
index 5998c35..881f8a5 100644
--- a/drivers/gpu/drm/nouveau/nvreg.h
+++ b/drivers/gpu/drm/nouveau/nvreg.h
@@ -147,28 +147,6 @@
 #	define NV_VIO_GX_DONT_CARE_INDEX	0x07
 #	define NV_VIO_GX_BIT_MASK_INDEX		0x08
 
-#define NV_PFB_BOOT_0			0x00100000
-#define NV_PFB_CFG0			0x00100200
-#define NV_PFB_CFG1			0x00100204
-#define NV_PFB_CSTATUS			0x0010020C
-#define NV_PFB_REFCTRL			0x00100210
-#	define NV_PFB_REFCTRL_VALID_1			(1 << 31)
-#define NV_PFB_PAD			0x0010021C
-#	define NV_PFB_PAD_CKE_NORMAL			(1 << 0)
-#define NV_PFB_TILE_NV10		0x00100240
-#define NV_PFB_TILE_SIZE_NV10		0x00100244
-#define NV_PFB_REF			0x001002D0
-#	define NV_PFB_REF_CMD_REFRESH			(1 << 0)
-#define NV_PFB_PRE			0x001002D4
-#	define NV_PFB_PRE_CMD_PRECHARGE			(1 << 0)
-#define NV_PFB_CLOSE_PAGE2		0x0010033C
-#define NV_PFB_TILE_NV40		0x00100600
-#define NV_PFB_TILE_SIZE_NV40		0x00100604
-
-#define NV_PEXTDEV_BOOT_0		0x00101000
-#	define NV_PEXTDEV_BOOT_0_STRAP_FP_IFACE_12BIT	(8 << 12)
-#define NV_PEXTDEV_BOOT_3		0x0010100c
-
 #define NV_PCRTC_INTR_0					0x00600100
 #	define NV_PCRTC_INTR_0_VBLANK				(1 << 0)
 #define NV_PCRTC_INTR_EN_0				0x00600140
@@ -285,6 +263,7 @@
 #		define NV_CIO_CRE_HCUR_ADDR1_ADR	7:2
 #	define NV_CIO_CRE_LCD__INDEX		0x33
 #		define NV_CIO_CRE_LCD_LCD_SELECT	0:0
+#		define NV_CIO_CRE_LCD_ROUTE_MASK	0x3b
 #	define NV_CIO_CRE_DDC0_STATUS__INDEX	0x36
 #	define NV_CIO_CRE_DDC0_WR__INDEX	0x37
 #	define NV_CIO_CRE_ILACE__INDEX		0x39	/* interlace */
-- 
1.7.3