The majority of the scatterlist allocations used in KGSL are under 1
page (1 page of struct scatterlist is approximately 1024 entries
equalling 4MB of allocated buffer). In these cases using vmalloc
for the sglist is undesirable and slow. Add functions to check the
size of the allocation and favor kzalloc for 1 page allocations and
vmalloc for larger lists.
Change-Id: Ic0dedbad99b60111677dd56b74edd8cedcac17f0
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
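As a rough illustration of the size check described in the scatterlist
change above, the following userspace sketch picks an allocator from the
list size in bytes. All names here (SKETCH_PAGE_SIZE, sketch_alloc_kind,
sketch_sglist_alloc_kind) are hypothetical, the 4096-byte page size is an
assumption, and treating a list of exactly one page as "small" is a guess
rather than the driver's confirmed threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration; the kernel provides PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

enum sketch_alloc_kind {
	SKETCH_ALLOC_KZALLOC, /* small list: physically contiguous, cheap */
	SKETCH_ALLOC_VMALLOC  /* large list: virtually contiguous */
};

/* Choose the allocator for an sglist that needs sglen_bytes of storage. */
static enum sketch_alloc_kind sketch_sglist_alloc_kind(size_t sglen_bytes)
{
	if (sglen_bytes <= SKETCH_PAGE_SIZE)
		return SKETCH_ALLOC_KZALLOC;
	return SKETCH_ALLOC_VMALLOC;
}
```

The matching free path would need the same check (or an equivalent test
on the pointer) so that each list is released by the allocator that
created it.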
The 2d hardware handles ringbuffer and IB commands as
a series of gotos. At the end of each IB, there must
be a goto command back to the ringbuffer, which must
be "monkey patched" into the IB by the driver.
Fix this code to use a proper kernel mapping.
Change-Id: Ic35e6fbf6baeef51dbc2497f1702c7ccd6997579
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Replace vmalloc allocation with physical page allocation. For most
allocations we do not need a kernel virtual address, and vmalloc uses up
the kernel virtual address space. Replacing vmalloc with physical
page allocation, and mapping that allocation to kernel space only
when it is required, prevents the kgsl driver from using unnecessary
vmalloc virtual space.
Change-Id: Idc716c8366f837f06a61b154deacec65a3a0662e
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Separate ib parse checking from cffdump as it is useful
in other situations. This is controlled by a new debugfs
file, ib_check. All ib checking is off (0) by default,
because parsing and mem_entry lookup can have a performance
impact on some benchmarks. Level 1 checking verifies the
IB1's. Level 2 checking also verifies the IB2.
Change-Id: Ibf3c6d1e0d7522e75b41e1a6dbb92020ae9ace8d
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
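The ib_check levels described above could be modelled roughly as
follows. This is only a sketch: the function name and the "depth"
parameter (1 for an IB1, 2 for an IB2) are hypothetical and do not come
from the driver.

```c
#include <assert.h>

/*
 * Return nonzero if an indirect buffer at the given nesting depth
 * should be verified under the given ib_check level.
 *   level 0: no checking (default)
 *   level 1: verify IB1s only
 *   level 2: verify IB1s and IB2s
 */
static int sketch_ib_should_check(int ib_check_level, int ib_depth)
{
	if (ib_depth < 1 || ib_depth > 2)
		return 0;
	return ib_depth <= ib_check_level;
}
```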
Add KGSL_IOCTL_SETPROPERTY to allow certain features to be enabled in
the kernel driver via userspace.
Change-Id: Ic0dedbadcbf3bfd451db947cec5d997261b12915
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The snapshot code was incorrectly parsing type0 packets in
indirect buffers which ended up with the snapshot code trying
to dump random values as valid GPU addresses. There were other
failsafes in place to make sure we didn't actually try to read
the memory, but it still made for an incomplete snapshot and lots
of annoying error messages.
Change-Id: Ic0dedbad200ce0170a70c45a613e9717ff86658b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Dump the entire ringbuffer to the snapshot and let the parser decide
what it wants to show. This is a lot better than trying to make those
sorts of decisions in the kernel. Even if we are dumping the entire
ringbuffer it still only makes sense to dump the IBs for the hanging
frame so do the math to find the context switch before the
last submitted IB and dump only the IBs from there to rptr or the
next context switch whichever is first.
Change-Id: Ic0dedbad3fed6be1fca3ed8a320386f70a562d43
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Add nop packets to the ringbuffer at the start and end of IB buffers
submitted by the user space driver. These nop packets serve as markers
that can be used during replay, recovery, and snapshot to get valid
data for a GPU hang dump.
Change-Id: Id080672b7c04a1b6cfbccbcf5d4591cb5f0b3058
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
The list of memory objects attached to a process gets searched quite
a lot during normal operation of the driver. For processes with a
lot of memory allocations, the linear search through the list is O(N)
and uses a lot more CPU during critical loops than it should. Change
the mem entry list to a rbtree for faster search speeds.
Change-Id: Ic0dedbad1b25d9d77f56f93696b2fe933fbad333
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
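To illustrate why a tree helps the mem entry search above, here is a
minimal userspace sketch of an address-range lookup in a binary search
tree keyed on GPU address. It is a plain unbalanced tree, not the
kernel rbtree the change refers to, and every name in it is
hypothetical. It assumes entries never overlap.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_mem_entry {
	unsigned long gpuaddr; /* start of the GPU address range */
	unsigned long size;    /* length of the range in bytes */
	struct sketch_mem_entry *left, *right;
};

/* Insert an entry, ordered by start address (no rebalancing). */
static void sketch_mem_insert(struct sketch_mem_entry **root,
			      struct sketch_mem_entry *entry)
{
	struct sketch_mem_entry **link = root;

	entry->left = entry->right = NULL;
	while (*link != NULL) {
		if (entry->gpuaddr < (*link)->gpuaddr)
			link = &(*link)->left;
		else
			link = &(*link)->right;
	}
	*link = entry;
}

/* Find the entry whose range contains gpuaddr, or NULL if none does. */
static struct sketch_mem_entry *
sketch_mem_find(struct sketch_mem_entry *root, unsigned long gpuaddr)
{
	struct sketch_mem_entry *node = root;

	while (node != NULL) {
		if (gpuaddr < node->gpuaddr)
			node = node->left;
		else if (gpuaddr >= node->gpuaddr + node->size)
			node = node->right;
		else
			return node;
	}
	return NULL;
}
```

Each step of the lookup discards one subtree, so a balanced tree such
as an rbtree keeps the search at O(log N) instead of the O(N) list walk.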
Create a separate kgsl pool to map global allocations made in kgsl
when IOMMU is used. Also, set the ttbr1 to point to defaultpagetable.
This allows us to map the kgsl allocations only once to the
defaultpagetable instead of having to map it to every pagetable
created.
Change-Id: I70fc051e852bf6820e09a1113a63ac93f7e0a51b
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
The pagetable pointer was checked against NULL after being used.
Check against NULL first and then dereference it.
Change-Id: I714de9e3b153f212cb92502a21c7d720dd4e1e37
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
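The ordering fix above follows the usual pattern, sketched here with a
hypothetical structure and accessor (neither is taken from the driver):

```c
#include <assert.h>
#include <stddef.h>

struct sketch_pagetable {
	unsigned int name;
};

/* Return the pagetable name, or -1 when no pagetable is supplied. */
static int sketch_pagetable_name(const struct sketch_pagetable *pt)
{
	if (pt == NULL)         /* check against NULL first ... */
		return -1;
	return (int)pt->name;   /* ... and only then dereference */
}
```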
This pwrscale policy provides per-core idle information to the
msm_dcvs driver. It accepts frequency updates from the msm_dcvs
driver and updates the core frequency as needed.
Change-Id: I201cfcb6ceedc19c27f7848781813d9c477f9f83
Signed-off-by: Lucille Sylvester <lsylvest@codeaurora.org>
Allow the GPU to operate at the highest perf level on
wake-up for better performance.
Change-Id: I0fd678359bd29a27fb218a506d1b307544ec5aae
Signed-off-by: Nilesh Shah <nsshah@codeaurora.org>
Fix the state to the requested state so that when both the ISR and
the timer fire at the same time, the state is set to SLEEP.
Change-Id: Ibeeaa8e586481eef0143f3cdb16bb8273ba2cc80
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
msm8930 Phase3 bringup is done with the A305 GPU, so revert the
workarounds added for 8930 phase2 bringup with 8960 using the A2xx GPU,
which are no longer valid.
This is equivalent to a revert of 8eea9cf9b0.
Change-Id: Ib7b830a8fcaf45ca5e17f101061e841b03ee537f
Signed-off-by: Sudhakara Rao Tentu <srtentu@codeaurora.org>
Fix a typo in the a3xx_reg.h file and set the correct bit
in the A3XX power counter register to enable the GPU busy
counter for pwrctrl monitoring.
Change-Id: Ic0dedbaddbda0769073f76fa532552620638c630
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Add a KGSL_DEV_ERR_ONCE macro to kgsl_log.h, with functionality
similar to pr_err_once() but the output format of dev_err(). Add logging
for deprecated code usage to kgsl_ioctl_sharedmem_from_vmalloc() and to
the KGSL_USER_MEM_TYPE_ADDR switch case in kgsl_ioctl_map_user_mem().
Change-Id: I43bbd5acfb4630b88170034d61f1d099fbe3f118
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Merge Upstream's stable 3.0.21 branch into msm-3.0
This consists of 814 commits and some merge conflicts.
The merge conflicts are because of some local changes to
msm-3.0 as well as some conflicts between Google's tree and
the upstream tree.
Conflicts:
arch/arm/kernel/head.S
drivers/bluetooth/ath3k.c
drivers/bluetooth/btusb.c
drivers/mmc/core/core.c
drivers/tty/serial/serial_core.c
drivers/usb/host/ehci-hub.c
drivers/usb/serial/qcserial.c
fs/namespace.c
fs/proc/base.c
Change-Id: I62e2edbe213f84915e27f8cd6e4f6ce23db22a21
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
Instead of mapping one 4K page at a time into the IOMMU, create a
scatterlist and map everything at once. This will be more efficient.
Change-Id: I8e83066869dd6f7a479bad22a66e4c70cc5973b5
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
The memory is allocated by vmalloc_user, which initializes the memory
to zero. However, the initialization was being discarded by the cache
invalidate before the memory was returned to userspace.
Change-Id: I4d44fb3c308bb04c6cb137f7d34cd38600980f42
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Change the pm_qos vote to default when the display goes
off. This allows the CPU to do idle power collapse after
the display goes off.
Change-Id: Id7c3af50e66c9deab483da98cac2569f56cd21e4
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
It turns out that A3XX_RBBM_HARDWARE_VERSION returns 0x0 for both A320
and A305. This, combined with some faulty logic in the GPU list, caused
A320 to be reported as an A305. This had the immediate effect of costing
A320 on apq8054 half the GMEM that it deserves and also triggering
instabilities in the user mode driver. Instead of trying to read multiple
registers to figure out the GPU ID, make the reasoned assumption that for
now at least, GPU ID will match SoC ID. Construct the chip_id based on the
SoC ID for A3XX targets and fix up the reported chip_id so it matches what
user space expects.
Change-Id: Ic0dedbadc74cb08fd7bc0bfb523b710ad33ed78c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Instead of a fixed 256MB virtual range for both the GPUMMU and IOMMU, make
the virtual range a property of the MMU engine and set the IOMMU range to
2GB. Technically we could go all the way up to 4G, but even 2G is far out
of the realm of possibility in the current generation, and we wanted to
reserve some of the space for future enhancements.
Change-Id: Ic0dedbad2987beb162b6a1878dd65ffae8a78522
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
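One way to picture the per-engine virtual range from the change above
is a lookup keyed on MMU type. The enum and function names are
hypothetical, and keeping the GPUMMU at the earlier 256MB value is an
assumption, since the message only states the new IOMMU range.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_mmu_type {
	SKETCH_MMU_GPUMMU,
	SKETCH_MMU_IOMMU
};

/* Size in bytes of the GPU virtual address range for an MMU engine. */
static uint64_t sketch_mmu_va_range(enum sketch_mmu_type type)
{
	switch (type) {
	case SKETCH_MMU_IOMMU:
		return 2048ull << 20; /* 2GB, as stated in the change */
	case SKETCH_MMU_GPUMMU:
	default:
		return 256ull << 20;  /* 256MB, assumed unchanged */
	}
}
```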
Idlestats powerscale policy is required for userspace GPU DCVS.
This change sets it as default, so that the GPU DCVS daemon can
be started without having to set it first.
Change-Id: Ia280c9f685262b2848f1b85d74876f15a2e6ad6f
Signed-off-by: Lynus Vaz <lvaz@codeaurora.org>
Use default VBIF settings for single SMMU in case of 8x30 and
configure VBIF settings for dual SMMU in case of 8064.
Change-Id: I0e9522eecc687615f285d905d8bd6ae4341595c9
Signed-off-by: Sudhakara Rao Tentu <srtentu@codeaurora.org>
Ion carveout and content protect heap buffers do not
have a struct page associated with them. Thus
sg_phys() will not work reliably on these buffers,
so set dma_address on their scatterlists.
CRs-Fixed: 345257
Change-Id: Ifdad5ce497de170f47b4ee2f7a93563a5cbe1a96
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Ion carveout and content protect heap buffers do not
have a struct page associated with them. Thus
sg_phys() will not work reliably on these buffers.
Set the dma_address field on physically contiguous
buffers. When mapping a scatterlist to the gpummu
use sg_dma_address() first and if it returns 0
then use sg_phys().
Change-Id: Ie5f19986446be4383dfbfffa2534136b592e8e46
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
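The fallback order described above (sg_dma_address() first, then
sg_phys() when it returns 0) can be sketched in userspace like this.
The structure is a hypothetical stand-in for struct scatterlist, with
page_phys playing the role of the value sg_phys() would compute from
the struct page.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_sg {
	uint32_t dma_address; /* set for buffers without a struct page */
	uint32_t page_phys;   /* stand-in for the sg_phys() result */
};

/* Pick the physical address to program into the gpummu. */
static uint32_t sketch_sg_gpu_phys(const struct sketch_sg *sg)
{
	uint32_t addr = sg->dma_address;

	/* A zero dma_address means it was never set; fall back. */
	if (addr == 0)
		addr = sg->page_phys;
	return addr;
}
```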
The IOMMU driver takes a spinlock internally when mapping, so
do not take an additional spinlock when mapping to IOMMU table.
Change-Id: I772ffb09af95ed15dc2c3495affa9efd48e4af5b
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>