  1. Nov 13, 2018
    • PCI/ASPM: Fix link_state teardown on device removal · 19e14e88
      Lukas Wunner authored
      
      commit aeae4f3e upstream.
      
      Upon removal of the last device on a bus, the link_state of the bridge
      leading to that bus is sought to be torn down by having pci_stop_dev()
      call pcie_aspm_exit_link_state().
      
      When ASPM was originally introduced by commit 7d715a6c ("PCI: add
      PCI Express ASPM support"), it determined whether the device being
      removed is the last one by calling list_empty() on the bridge's
      subordinate devices list.  That didn't work because the device is only
      removed from the list slightly later in pci_destroy_dev().
      
      Commit 3419c75e ("PCI: properly clean up ASPM link state on device
      remove") attempted to fix it by calling list_is_last(), but that's not
      correct either because it checks whether the device is at the *end* of
      the list, not whether it's the last one *left* in the list.  If the user
      removes the device which happens to be at the end of the list via sysfs
      but other devices are preceding the device in the list, the link_state
      is torn down prematurely.
      
      The real fix is to move the invocation of pcie_aspm_exit_link_state() to
      pci_destroy_dev() and reinstate the call to list_empty().  Remove a
      duplicate check for dev->bus->self because pcie_aspm_exit_link_state()
      already contains an identical check.
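The difference between the two list checks can be reproduced in userspace. The following is a minimal sketch, assuming a simplified re-implementation of the kernel's circular doubly linked list; `struct node`, `node_add_tail()`, `node_del()`, `is_at_end()`, `is_empty()` and `premature_teardown()` are hypothetical names chosen for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct node {
	struct node *prev, *next;
};

void list_init(struct node *head)
{
	head->prev = head;
	head->next = head;
}

void node_add_tail(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void node_del(struct node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = entry->next = entry;
}

/* Same semantics as list_is_last(): is entry at the tail of the list? */
bool is_at_end(const struct node *entry, const struct node *head)
{
	return entry->next == head;
}

/* Same semantics as list_empty(): does the list contain no entries? */
bool is_empty(const struct node *head)
{
	return head->next == head;
}

/*
 * Returns true if the tail-position check would tear down the link
 * state even though another device remains on the bus.
 */
bool premature_teardown(void)
{
	struct node bus, dev_a, dev_b;
	bool torn_down_by_tail_check, bus_empty_after_removal;

	list_init(&bus);
	node_add_tail(&dev_a, &bus);
	node_add_tail(&dev_b, &bus);

	/* dev_b sits at the end of the list, so the tail check fires. */
	torn_down_by_tail_check = is_at_end(&dev_b, &bus);

	/* Check emptiness only after unlinking, as the fix describes. */
	node_del(&dev_b);
	bus_empty_after_removal = is_empty(&bus);

	return torn_down_by_tail_check && !bus_empty_after_removal;
}
```

In this sketch the tail check reports "last device" while `dev_a` is still on the bus, whereas the emptiness check performed after unlinking does not.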
      
      Fixes: 7d715a6c ("PCI: add PCI Express ASPM support")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: stable@vger.kernel.org # v2.6.26
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. Dec 20, 2017
    • PCI: Detach driver before procfs & sysfs teardown on device remove · 96ed7ca7
      Alex Williamson authored
      
      
      [ Upstream commit 16b6c8bb ]
      
      When removing a device, for example a VF being removed due to SR-IOV
      teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual
      hot-unplug, we first remove the procfs and sysfs attributes for the device
      before attempting to release the device from any driver bound to it.
      Unbinding the driver from the device can take time.  The device might need
      to write out data or it might be actively in use.  If it's in use by
      userspace through a vfio driver, the unbind might block until the user
      releases the device.  This leads to a potentially non-trivial amount of
      time where the device exists, but we've torn down the interfaces that
      userspace uses to examine devices, for instance lspci might generate this
      sort of error:
      
        pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config
        lspci: Unable to read the standard configuration space header of device 0000:01:0a.3
      
      We don't seem to have any dependence on this teardown ordering in the
      kernel, so let's unbind the driver first, which is also more symmetric with
      the instantiation of the device in pci_bus_add_device().
      
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. Nov 18, 2016
    • PCI: Autosense device removal in pci_bridge_d3_update() · 1ed276a7
      Lukas Wunner authored
      
      The algorithm to update the flag indicating whether a bridge may go to D3
      makes a few optimizations based on whether the update was caused by the
      removal of a device on the one hand, versus the addition of a device or the
      change of its D3cold flags on the other hand.
      
      The information whether the update pertains to a removal is currently
      passed in by the caller, but the function may as well determine that itself
      by examining the device in question, thereby allowing for a considerable
      simplification and reduction of the code.
      
      Out of several options to determine removal, I've chosen the function
      device_is_registered() because it's cheap:  It merely returns the
      dev->kobj.state_in_sysfs flag.  That flag is set through device_add() when
      the root bus is scanned and cleared through device_del().  The call to
      pci_bridge_d3_update() happens after each of these calls, respectively, so
      the ordering is correct.
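The idea of letting the callee detect removal on its own can be sketched in userspace. This is only an illustration under stated assumptions: `struct toy_kobject`, `struct toy_device`, `toy_device_is_registered()` and `toy_update_is_removal()` are hypothetical, reduced stand-ins and not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for struct kobject / struct device. */
struct toy_kobject {
	bool state_in_sysfs;
};

struct toy_device {
	struct toy_kobject kobj;
};

/* Mirrors the cheap check described above: just return the sysfs flag. */
bool toy_device_is_registered(const struct toy_device *dev)
{
	return dev->kobj.state_in_sysfs;
}

/* The callee derives "removal" itself instead of taking it as a parameter. */
bool toy_update_is_removal(const struct toy_device *dev)
{
	return !toy_device_is_registered(dev);
}
```

Because the flag is already maintained when the device is added and deleted, the caller no longer needs to pass a separate "remove" argument.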
      
      No functional change intended.
      
      Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  4. Sep 13, 2016
  5. Jun 13, 2016
    • PCI: Put PCIe ports into D3 during suspend · 9d26d3a8
      Mika Westerberg authored
      
      Currently the Linux PCI core does not touch the power state of PCI bridges and
      PCIe ports when system suspend is entered.  Leaving them in D0 consumes
      power unnecessarily and may prevent the CPU from entering deeper C-states.
      
      With recent PCIe hardware we can power down the ports to save power, provided
      that we take into account a few restrictions:
      
        - The PCIe port hardware is recent enough, starting from 2015.
      
        - Devices connected to PCIe ports are effectively in D3cold once the port
          is transitioned to D3 (the config space is not accessible anymore and
          the link may be powered down).
      
        - Devices behind the PCIe port need to be allowed to transition to D3cold
          and back.  There is a way both drivers and userspace can forbid this.
      
        - If the device behind the PCIe port is capable of waking the system it
          needs to be able to do so from D3cold.
      
      This patch adds a new flag to struct pci_dev called 'bridge_d3'.  This
      flag is set and cleared by the PCI core whenever there is a change in power
      management state of any of the devices behind the PCIe port.  When the system
      is later suspended we only need to check this flag: if it is true, we
      transition the port to D3; otherwise we leave it in D0.
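How such a flag could be recomputed from the devices behind a port can be sketched as follows. This is a rough userspace illustration, not the kernel implementation: `struct toy_child`, its fields and `toy_bridge_d3()` are hypothetical names, and the hardware-age restriction from the list above is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device summary; not the kernel's struct pci_dev. */
struct toy_child {
	bool d3cold_allowed;      /* neither driver nor userspace forbids D3cold */
	bool may_wakeup;          /* device is configured to wake the system */
	bool wakeup_from_d3cold;  /* device can signal wakeup from D3cold */
};

/* Recompute the bridge flag from all devices behind the port. */
bool toy_bridge_d3(const struct toy_child *children, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Any device that may not enter D3cold keeps the port in D0. */
		if (!children[i].d3cold_allowed)
			return false;
		/* A wakeup source must be able to wake from D3cold. */
		if (children[i].may_wakeup && !children[i].wakeup_from_d3cold)
			return false;
	}
	return true;
}
```

At suspend time, only the cached result would need to be consulted, which matches the "check this flag" step described above.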
      
      Also provide an override mechanism via the command line parameter
      "pcie_port_pm=[off|force]" that can be used to disable or enable the
      feature regardless of the BIOS manufacturing date.
      
      Tested-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. Mar 12, 2016
  7. Mar 08, 2016
  8. Aug 13, 2015
    • PCI: Embed ATS info directly into struct pci_dev · d544d75a
      Bjorn Helgaas authored
      
      The pci_ats struct is small and will get smaller, so I don't think it's
      worth allocating it separately from the pci_dev struct.
      
      Embed the ATS fields directly into struct pci_dev.
      
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Joerg Roedel <jroedel@suse.de>
    • PCI: Allocate ATS struct during enumeration · edc90fee
      Bjorn Helgaas authored
      Previously, we allocated pci_ats structures when an IOMMU driver called
      pci_enable_ats().  An SR-IOV VF shares the STU setting with its PF, so when
      enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
      didn't already have one.  We held the sriov->lock to serialize threads
      concurrently enabling ATS on several VFs so only one would allocate the PF
      pci_ats.
      
      Gregor reported a deadlock here:
      
        pci_enable_sriov
          sriov_enable
            virtfn_add
              mutex_lock(dev->sriov->lock)      # acquire sriov->lock
              pci_device_add
                device_add
                  BUS_NOTIFY_ADD_DEVICE notifier chain
                  iommu_bus_notifier
                    amd_iommu_add_device        # iommu_ops.add_device
                      init_iommu_group
                        iommu_group_get_for_dev
                          iommu_group_add_device
                            __iommu_attach_device
                              amd_iommu_attach_device  # iommu_ops.attach_device
                                attach_device
                                  pci_enable_ats
                                    mutex_lock(dev->sriov->lock) # deadlock
      
      There's no reason to delay allocating the pci_ats struct, and if we
      allocate it for each device at enumeration-time, there's no need for
      locking in pci_enable_ats().
      
      Allocate pci_ats struct during enumeration, when we initialize other
      capabilities.
      
      Note that this implementation requires ATS to be enabled on the PF first,
      before on any of the VFs because the PF controls the STU for all the VFs.
      
      Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
      
      
      Reported-by: Gregor Dick <gdick@solarflare.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Joerg Roedel <jroedel@suse.de>
  9. Apr 08, 2015
  10. Feb 01, 2014
    • Revert "PCI: Remove from bus_list and release resources in pci_release_dev()" · 04480094
      Rafael J. Wysocki authored
      
      Revert commit ef83b078 "PCI: Remove from bus_list and release
      resources in pci_release_dev()" that made some nasty race conditions
      become possible.  For example, if a Thunderbolt link is unplugged
      and then replugged immediately, the pci_release_dev() resulting from
      the hot-remove code path may be racing with the hot-add code path
      which after that commit causes various kinds of breakage to happen
      (up to and including a hard crash of the whole system).
      
      Moreover, the problem that commit ef83b078 attempted to address
      cannot happen any more after commit 8a4c5c32 "PCI: Check parent
      kobject in pci_destroy_dev()", because pci_destroy_dev() will now
      return immediately if it has already been executed for the given
      device.
      
      Note, however, that the invocation of msi_remove_pci_irq_vectors()
      removed by commit ef83b078 from pci_free_resources() along with
      the other changes made by it is not added back because of subsequent
      code changes depending on that modification.
      
      Fixes: ef83b078 (PCI: Remove from bus_list and release resources in pci_release_dev())
      Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. Jan 15, 2014
    • PCI: Check parent kobject in pci_destroy_dev() · 8a4c5c32
      Rafael J. Wysocki authored
      
      If pci_stop_and_remove_bus_device() is run concurrently for a device and
      its parent bridge via remove_callback(), both code paths attempt to acquire
      pci_rescan_remove_lock.  If the child device removal acquires it first,
      there will be no problems.  However, if the parent bridge removal acquires
      it first, it will eventually execute pci_destroy_dev() for the child
      device, but that device object will not be freed yet due to the reference
      held by the concurrent child removal.  Consequently, both
      pci_stop_bus_device() and pci_remove_bus_device() will be executed for that
      device unnecessarily and pci_destroy_dev() will see a corrupted list head
      in that object.  Moreover, an excess put_device() will be executed for that
      device in that case which may lead to a use-after-free in the final
      kobject_put() done by sysfs_schedule_callback_work().
      
      To avoid that problem, make pci_destroy_dev() check if the device's parent
      kobject is NULL, which only happens after device_del() has already run for
      it.  Make pci_destroy_dev() return immediately without doing anything in
      that case.
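The guard described above makes the teardown idempotent. The following is a minimal userspace sketch of that pattern, assuming hypothetical stand-ins (`struct toy_kobject`, `struct toy_device`, `toy_device_del()`, `toy_destroy_dev()`); it does not model locking or reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for struct kobject / struct device. */
struct toy_kobject {
	struct toy_kobject *parent;
};

struct toy_device {
	struct toy_kobject kobj;
	int teardown_count;
};

/* Stand-in for device_del(): unlinking clears the parent pointer. */
void toy_device_del(struct toy_device *dev)
{
	dev->kobj.parent = NULL;
}

/* Returns true if teardown ran, false if it was skipped. */
bool toy_destroy_dev(struct toy_device *dev)
{
	/* A NULL parent means another removal path already got here. */
	if (!dev->kobj.parent)
		return false;

	toy_device_del(dev);
	dev->teardown_count++;
	return true;
}
```

A second call for the same device then returns early instead of repeating the list removal and the extra reference drop.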
      
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  12. Jan 14, 2014
    • PCI: Add global pci_lock_rescan_remove() · 9d16947b
      Rafael J. Wysocki authored
      
      There are multiple PCI device addition and removal code paths that may be
      run concurrently with the generic PCI bus rescan and device removal that
      can be triggered via sysfs.  If that happens, it may lead to multiple
      different, potentially dangerous race conditions.
      
      The most straightforward way to address those problems is to run
      the code in question under the same lock that is used by the
      generic rescan/remove code in pci-sysfs.c.  To prepare for those
      changes, move the definition of the global PCI remove/rescan lock
      to probe.c and provide global wrappers, pci_lock_rescan_remove()
      and pci_unlock_rescan_remove(), allowing drivers to manipulate
      that lock.  Also provide pci_stop_and_remove_bus_device_locked()
      for the callers of pci_stop_and_remove_bus_device() who only need
      to hold the rescan/remove lock around it.
      
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  13. Dec 18, 2013
    • PCI: Remove from bus_list and release resources in pci_release_dev() · ef83b078
      Yinghai Lu authored
      
      Previously we removed the pci_dev from the bus_list and released its
      resources in pci_destroy_dev().  But that's too early: it's possible to
      call pci_destroy_dev() twice for the same device (e.g., via sysfs), and
      that will cause an oops when we try to remove it from bus_list the second
      time.
      
      We should remove it from the bus_list only when the last reference to the
      pci_dev has been released, i.e., in pci_release_dev().
      
      [bhelgaas: changelog]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: Use device_release_driver() in pci_stop_root_bus() · e3b439e1
      Yinghai Lu authored
      
      To be consistent with 4bff6749 ("PCI: Move device_del() from
      pci_stop_dev() to pci_destroy_dev()"), this changes pci_stop_root_bus()
      to use device_release_driver() instead of device_del().
      
      This also changes pci_remove_root_bus() to use device_unregister()
      instead of put_device() so it corresponds with the device_register()
      call in pci_create_root_bus().
      
      [bhelgaas: changelog]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev() · c4a0a5d9
      Rafael J. Wysocki authored
      After commit bcdde7e2 (sysfs: make __sysfs_remove_dir() recursive)
      I'm seeing traces analogous to the one below in Thunderbolt testing:
      
      WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
       sysfs group ffffffff81c6c500 not found for kobject '0000:08'
       Modules linked in: ...
       CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
       Hardware name: Acer Aspire S5-391/Venus    , BIOS V1.02 05/29/2012
       Workqueue: kacpi_hotplug acpi_hotplug_work_fn
        0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
        ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
        0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
       Call Trace:
        [<ffffffff816b23bf>] dump_stack+0x4e/0x71
        [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0
        [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50
        [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80
        [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0
        [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50
        [<ffffffff81495818>] device_del+0x58/0x1c0
        [<ffffffff814959c8>] device_unregister+0x48/0x60
        [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80
        [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110
        [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110
        [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20
        [<ffffffff813418d0>] disable_slot+0x20/0xe0
        [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0
        [<ffffffff813427ad>] hotplug_event+0x17d/0x220
        [<ffffffff81342880>] hotplug_event_work+0x30/0x70
        [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24
        [<ffffffff81061331>] process_one_work+0x261/0x450
        [<ffffffff81061a7e>] worker_thread+0x21e/0x370
        [<ffffffff81061860>] ? rescuer_thread+0x300/0x300
        [<ffffffff81068342>] kthread+0xd2/0xe0
        [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
        [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0
        [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
      
      (Mika Westerberg sees them too in his tests).
      
      Some investigation documented in kernel bug #65281 led me to the
      conclusion that the source of the problem is the device_del() in
      pci_stop_dev() as it now causes the sysfs directory of the device to be
      removed recursively along with all of its subdirectories.  That includes
      the sysfs directory of the device's subordinate bus (dev->subordinate) and
      its "power" group.
      
      Consequently, when pci_remove_bus() is called for dev->subordinate in
      pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
      point the sysfs directory of bus->dev doesn't exist any more and its
      "power" group doesn't exist either.  Thus, when dpm_sysfs_remove() called
      from device_del() tries to remove that group, it triggers the above
      warning.
      
      That indicates a logical mistake in the design of
      pci_stop_and_remove_bus_device(), which causes bus device objects to be
      left behind their parents (bridge device objects) and can be fixed by
      moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
      pci_remove_bus() can be called for the device's subordinate bus before the
      device itself is unregistered from the hierarchy.  Still, the driver, if
      any, should be detached from the device in pci_stop_dev(), so use
      device_release_driver() directly from there.
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
      
      
      Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  14. Nov 25, 2013
    • PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev() · 4bff6749
      Rafael J. Wysocki authored
      After commit bcdde7e2 (sysfs: make __sysfs_remove_dir() recursive)
      I'm seeing traces analogous to the one below in Thunderbolt testing:
      
      WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
       sysfs group ffffffff81c6c500 not found for kobject '0000:08'
       Modules linked in: ...
       CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
       Hardware name: Acer Aspire S5-391/Venus    , BIOS V1.02 05/29/2012
       Workqueue: kacpi_hotplug acpi_hotplug_work_fn
        0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
        ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
        0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
       Call Trace:
        [<ffffffff816b23bf>] dump_stack+0x4e/0x71
        [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0
        [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50
        [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80
        [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0
        [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50
        [<ffffffff81495818>] device_del+0x58/0x1c0
        [<ffffffff814959c8>] device_unregister+0x48/0x60
        [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80
        [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110
        [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110
        [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20
        [<ffffffff813418d0>] disable_slot+0x20/0xe0
        [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0
        [<ffffffff813427ad>] hotplug_event+0x17d/0x220
        [<ffffffff81342880>] hotplug_event_work+0x30/0x70
        [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24
        [<ffffffff81061331>] process_one_work+0x261/0x450
        [<ffffffff81061a7e>] worker_thread+0x21e/0x370
        [<ffffffff81061860>] ? rescuer_thread+0x300/0x300
        [<ffffffff81068342>] kthread+0xd2/0xe0
        [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
        [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0
        [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
      
      (Mika Westerberg sees them too in his tests).
      
      Some investigation documented in kernel bug #65281 led me to the
      conclusion that the source of the problem is the device_del() in
      pci_stop_dev() as it now causes the sysfs directory of the device to be
      removed recursively along with all of its subdirectories.  That includes
      the sysfs directory of the device's subordinate bus (dev->subordinate) and
      its "power" group.
      
      Consequently, when pci_remove_bus() is called for dev->subordinate in
      pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
      point the sysfs directory of bus->dev doesn't exist any more and its
      "power" group doesn't exist either.  Thus, when dpm_sysfs_remove() called
      from device_del() tries to remove that group, it triggers the above
      warning.
      
      That indicates a logical mistake in the design of
      pci_stop_and_remove_bus_device(), which causes bus device objects to be
      left behind their parents (bridge device objects) and can be fixed by
      moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
      pci_remove_bus() can be called for the device's subordinate bus before the
      device itself is unregistered from the hierarchy.  Still, the driver, if
      any, should be detached from the device in pci_stop_dev(), so use
      device_release_driver() directly from there.
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
      
      
      Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  15. Nov 14, 2013
  16. Apr 12, 2013
    • PCI: Add pcibios hooks for adding and removing PCI buses · 10a95747
      Jiang Liu authored
      
      On ACPI-based platforms, the pci_slot driver creates PCI slot devices
      according to information from ACPI tables by registering an ACPI PCI
      subdriver.  The ACPI PCI subdriver will only be called when creating/
      destroying PCI root buses, and it won't be called when hot-plugging
      P2P bridges.  It may cause stale PCI slot devices after hot-removing
      a P2P bridge if that bridge has associated PCI slots.  And the acpiphp
      driver has the same issue too.
      
      This patch introduces two hook points into the PCI core, which will
      be invoked when creating/destroying PCI buses for PCI host and P2P
      bridges.  They could be used to setup/destroy platform dependent stuff
      in a unified way, both at boot time and for PCI hotplug operations.
      
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Signed-off-by: Yijing Wang <wangyijing@huawei.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Yinghai Lu <yinghai@kernel.org>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Myron Stowe <myron.stowe@redhat.com>
    • PCI: When removing bus, always remove legacy files & unregister · 1e89d268
      Jiang Liu authored
      
      We always call device_register() and pci_create_legacy_files() for a
      new bus before handing out the "struct pci_bus *".  Therefore, there's
      no possibility of removing the bus with pci_remove_bus() before those
      calls have been made, so we don't need to check "bus->is_added" before
      calling pci_remove_legacy_files() and device_unregister().
      
      [bhelgaas: changelog]
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Signed-off-by: Yijing Wang <wangyijing@huawei.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Yinghai Lu <yinghai@kernel.org>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
  17. Feb 13, 2013
  18. Jan 25, 2013
  19. Nov 03, 2012
  20. Sep 20, 2012
  21. Aug 22, 2012
  22. Jun 13, 2012
  23. Feb 27, 2012
  24. Feb 14, 2012
    • PCI: make sriov work with hotplug remove · ac205b7b
      Yinghai Lu authored
      When hot removing a pci express module that has a pcie switch and supports
      SRIOV, we got:
      
      [ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
      [ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
      [ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
      [ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
      [ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
      [ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
      [ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
      [ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
      [ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain:bus:device=0000:b0:00
      [ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
      [ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain:bus:dev = 0000:b0:00
      [ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
      [ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
      [ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
      [ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
      [ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
      [ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
      [ 5926.133377] PGD 0
      [ 5926.135402] Oops: 0000 [#1] SMP
      [ 5926.138659] CPU 2
      [ 5926.140499] Modules linked in:
      ...
      [ 5926.143754]
      [ 5926.275823] Call Trace:
      [ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
      [ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
      [ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
      [ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
      [ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
      [ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
      ...
      
       +-[0000:80]-+-00.0-[81-8f]--
       |           +-01.0-[90-9f]--
       |           +-02.0-[a0-af]--
       |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
       |           |                               |            +-00.1 Device
       |           |                               |            +-00.2 Device
       |           |                               |            \-00.3 Device
       |           |                               \-03.0-[b3]--+-00.0 Device
       |           |                                            +-00.1 Device
       |           |                                            +-00.2 Device
       |           |                                            \-00.3 Device
      
      root complex: 80:02.2
      pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
      end devices are b2:00.0 and b3:00.0.
      VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3
      
      Root cause: when pci_stop_bus_device() runs on a physical function (PF), it
      stops the virtual functions (VFs) and removes them, so
      	list_for_each_safe(l, n, &bus->devices)
      ends up dereferencing n, which points to a VF entry that has already been
      freed.

      The solution is to replace list_for_each_safe() with
      list_for_each_prev_safe().  This ensures that n is a valid pointer to the PF
      rather than to a freed VF (because newly added devices are inserted at the
      tail of the bus->devices list).
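The effect of the iteration direction can be reproduced in userspace. The following is a sketch under stated assumptions: the list helpers are simplified re-implementations, a `removed` flag stands in for freeing (nothing is actually freed, so the sketch stays well defined), and all names (`struct node`, `stop_device()`, `walk_hits_stale_entry()`, `iteration_goes_stale()`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *prev, *next;
	bool is_pf;     /* stopping a PF also removes its VFs */
	bool removed;   /* stands in for "freed"; nothing is really freed */
};

void list_init(struct node *head)
{
	head->prev = head;
	head->next = head;
}

void node_add_tail(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void node_unlink(struct node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->removed = true;
}

/* Stopping the PF removes every VF from the same list. */
void stop_device(struct node *dev, struct node *head)
{
	struct node *pos, *n;

	if (!dev->is_pf)
		return;
	for (pos = head->next; pos != head; pos = n) {
		n = pos->next;
		if (!pos->is_pf)
			node_unlink(pos);
	}
}

/* Walk the list; return true if the cached pointer went stale. */
bool walk_hits_stale_entry(struct node *head, bool reverse)
{
	struct node *pos = reverse ? head->prev : head->next;
	struct node *n;

	while (pos != head) {
		/* Cache the neighbour first, as the _safe iterators do. */
		n = reverse ? pos->prev : pos->next;
		stop_device(pos, head);
		if (n != head && n->removed)
			return true;   /* the kernel would touch freed memory */
		pos = n;
	}
	return false;
}

/* Build the list PF, VF1, VF2 (VFs appended at the tail) and walk it. */
bool iteration_goes_stale(bool reverse)
{
	struct node head = {0}, pf = {0}, vf1 = {0}, vf2 = {0};

	pf.is_pf = true;
	list_init(&head);
	node_add_tail(&pf, &head);
	node_add_tail(&vf1, &head);
	node_add_tail(&vf2, &head);

	return walk_hits_stale_entry(&head, reverse);
}
```

Walking forward caches a VF as the next entry before the PF is stopped, so the cached pointer is stale afterwards; walking backward reaches the PF last, when the cached neighbour is the list head.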
      
      While reviewing the patch, Bjorn said:
      |   The PCI hot-remove path calls pci_stop_bus_devices() via
      |   pci_remove_bus_device().
      |
      |   pci_stop_bus_devices() traverses the bus->devices list (point A below),
      |   stopping each device in turn, which calls the driver remove() method.  When
      |   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
      |   also uses pci_remove_bus_device() to remove the VF devices from the
      |   bus->devices list (point B).
      |
      |       pci_remove_bus_device
      |         pci_stop_bus_device
      |           pci_stop_bus_devices(subordinate)
      |             list_for_each(bus->devices)             <-- A
      |               pci_stop_bus_device(PF)
      |                 ...
      |                   driver->remove
      |                     pci_disable_sriov
      |                       ...
      |                         pci_remove_bus_device(VF)
      |                             <remove from bus_list>  <-- B
      |
      |   At B, we're changing the same list we're iterating through at A, so when
      |   the driver remove() method returns, the pci_stop_bus_devices() iterator has
      |   a pointer to a list entry that has already been freed.
      
      The discussion threads can be found at:
        https://lkml.org/lkml/2011/10/15/141
        https://lkml.org/lkml/2012/1/23/360
      
      -v5: Following Linus's suggestion to make removal more robust, change to
           list_for_each_prev_safe() instead.  That is more reasonable, because
           those devices were added to the tail of the list.
      
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  25. Feb 10, 2012
    • PCI: Fix pci cardbus removal · 3682a394
      Yinghai Lu authored
      
      While testing busn_res allocation with CardBus, I found that PCI card removal
      no longer works.  It turns out to have been broken by:
      
      |commit 79cc9601
      |Date:   Tue Nov 22 21:06:53 2011 -0800
      |
      |    PCI: Only call pci_stop_bus_device() one time for child devices at remove
      
      The above changed the behavior of pci_remove_behind_bridge() that
      yenta_cardbus depended on.  So restore the old behavior of
      pci_remove_behind_bridge() (which requires stopping and removing all
      devices) by:
      
      1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
         __pci_remove_bus_device() call it instead.
      2. add pci_stop_behind_bridge that will stop devices behind a bridge
      3. add back pci_remove_behind_bridge that will stop and remove devices
         under bridge.
      
      -v2: update commit description a little bit.
      
      Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  26. Jan 06, 2012
  27. May 21, 2011